How to create a WWW Home Page

This document gives a brief description of how to create HTML documents, in particular, your own home page. It describes all of the basic commands you will need, and includes examples of all the commands discussed. One small section of the document is specific to the Fachhochschule Furtwangen--which section is obvious. The rest is general, and can be used by anyone who wants to produce a web-page.

This document presents the syntax of the various HTML commands very briefly, together with examples of their usage. You can also look at the actual HTML commands used to produce this document by selecting the "View Source" command on your net browser.

With one exception, I have tried to write this document using standard HTML version 2.0 commands. The exception is that I have used tables, which are widely supported but not a part of the standard. If you like, you can have a look at the official HTML 2.0 reference manual.

It is important to note that HTML 2.0 is (as of this writing, 28 May 1996) the latest approved standard. Lots of web browsers implement lots of nifty features: tables, panes, scrolling messages, etc, etc. But if you use the fancy features of your particular web browser, your page may be incomprehensible to people using other browsers.


Create an HTML File

To create a WWW Page, you create an HTML file. HTML is an abbreviation for "hyper-text markup language"; a very impressive-sounding name for a very simple idea. An HTML file is an ordinary text file which you can create with any editor. You place this file in the directory designated by your system administrator, and it will be available on the web. The specific procedure depends on your site.


Here at the Furtwangen AI-lab

The way this works here in the computer science lab in Furtwangen is as follows:

What goes in an HTML File

An HTML file is an ordinary text file, which you can create with any text editor. Note that word processors such as Word or WordPerfect are only suitable if you save the file in plain text format.


HTML Syntax

Everything in an HTML document is specified using commands, and all commands have the same basic syntax. The syntax of a command is:

<COMMAND [Options]>
TEXT
</COMMAND>

The command name appears inside angle-brackets, possibly with some options. Then comes the text which is to be affected by the command. The command ends with another pair of angle brackets containing a slash and the command name.

The line-breaks are optional--except as noted below for specific commands (like the PRE command), line-breaks are completely ignored in HTML. Note also that case is ignored. You can write either "<COMMAND>" or "<command>".

Where it makes sense, you can nest commands inside each other, to any desired depth. Also, web browsers are very forgiving of syntax errors; if you've made a mistake, your command will probably just be ignored. The same happens if you use a command that isn't supported by older web browsers; the browsers just ignore commands that they don't understand.

On the other hand, you shouldn't try to use complicated formatting. First, you want your text to still be readable if some of your commands aren't processed. Second, most web-browsers are pretty stupid in the way they do formatting, and different browsers may display your document differently.

For example, to emphasize text, you put it in italics. If you want to emphasize text when the current font is already italicized, you put the text back into the normal font. Most web browsers don't know this.

In short, stick to very basic formatting. If you really need to do something fancy, look at your document on a variety of web browsers to see if it looks the way you think it should.
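To illustrate the syntax rules in this section, here is a small sketch (the sentence itself is made up for the example). The two paragraphs are typed differently, with different case and different line-breaks, but a browser should display them the same way. Each one also shows commands nested inside each other.

```html
<P>Here is <STRONG>bold text with an <EM>emphasized</EM> word</STRONG>.</P>

<p>
Here is
<strong>bold text with an <em>emphasized</em> word</strong>.
</p>
```

Note that the nested commands are closed in the reverse of the order in which they were opened.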


HTML Document Structure

An HTML file is delimited by the "HTML" command. Within this there is a header and a body. The header identifies the document, and includes information about it. Usually the only information included is the document's title.

The body contains the text of the document. The text is linear, and may include headings, lists, paragraphs, and other structures. By "linear" I mean that the items in the body are displayed by a web-browser in the same order that they appear in the file. If you want to have a non-linear organization, you have to do this by using links and multiple files.

Hence, the overall structure of an HTML document looks like this:

<HTML>
<HEAD>
Header information
</HEAD>
<BODY>
Body text
</BODY>
</HTML>

The Header

The header gives general information about the document. This information is used by web-browsers when the document is accessed. It is not displayed as part of the document.

Usually the only information given is the title of the document, and this is required. The title of a document is what a browser uses to identify a document to the user. For example, the document title will be displayed in a "bookmark" list or a "goto" list.

This means that the title of a document should be meaningful without any knowledge of where the document is located or how it relates to other documents. For example, the title "home page" is useless. If you saw this in your list of bookmarks, you would have no idea whose home page it was. The same goes for titles like "introduction", "interesting links", and so forth.

The title command takes no options. So it looks like

<TITLE>Document Title</TITLE>

For an example, look at the title of this document, using the "View source" option on your browser.
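Putting the structure and the title together, a complete (if very small) HTML file might look like the sketch below. The name "Jane Example" is invented for illustration, and the "H1" and "P" commands used in the body are described in the sections that follow.

```html
<HTML>
<HEAD>
<TITLE>Jane Example's Home Page</TITLE>
</HEAD>
<BODY>
<H1>Jane Example</H1>
<P>
Welcome to my home page. This is the only paragraph so far.
</P>
</BODY>
</HTML>
```

The title says whose home page it is, so it still makes sense when it shows up on its own in a list of bookmarks.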


The Body

The body of a document is just a linear series of HTML commands surrounding bits of text. Here are the things that you can place in the body of an HTML document:


Headers

Not the same thing as the document header, these are what you use to title the sections and subsections of your document. In this document, all of the titles shown in bold, large print are produced using header commands. The header command takes no options, so it looks like:
<H?>Section Title</H?>

where the '?' is replaced by a number from 1 to 6. The "H1" header is the largest, and the "H6" header is the smallest. Generally speaking, you should not use header levels 4 to 6, as they are displayed in a font smaller than the document text. This is not only ugly, but sometimes unreadable at the default settings of some web browsers.

Here are samples of all six header sizes:

Chapter 3

Section 3.1

Subsection 3.1.1

Subsubsection 3.1.1a

Subsubsubsection 3.1.1a(1)
Subsubsubsubsection 3.1.1a(1a)
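Samples like the six above could be written along the following lines. This is a reconstruction for illustration, not a copy of this document's actual source.

```html
<H1>Chapter 3</H1>
<H2>Section 3.1</H2>
<H3>Subsection 3.1.1</H3>
<H4>Subsubsection 3.1.1a</H4>
<H5>Subsubsubsection 3.1.1a(1)</H5>
<H6>Subsubsubsubsection 3.1.1a(1a)</H6>
```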

Blocks of Text

The most basic element in the body of an HTML document is a block of text. And the most basic type of text-block is the paragraph, which is delimited with the "P" command and does not require any options:

<P> Paragraph text </P>

Although the "P" command does not require options, you can specify some. For example, this paragraph was centered using the option "align=center".

<P ALIGN=CENTER> Paragraph text </P>

One last thing about the paragraph command. Often you will have many paragraphs, one after the other. To save typing, you can omit the closing </P> when the next item in the HTML file is a <P>. So, for example, the following two bits of HTML produce exactly the same results:

<P>
Text of paragraph 1.
</P>
<P>
Text of paragraph 2.
</P>

<P>
Text of paragraph 1.
<P>
Text of paragraph 2.
</P>

There are a number of other types of text-blocks that you can include in your document. The main ones are:

Pre-formatted text. With normal paragraphs, browsers ignore the line-breaks in the source file. Lines are broken according to how wide the display is. You can see this by resizing the window of your browser; text is reformatted to suit the new width of the window.

With pre-formatted text, the browser must display the text with exactly the line-breaks that appear in the source file. For example, the following text is declared as pre-formatted:

This is a short line.
With pre-formatted text, we control the line-breaks, even if this means that lines are very short or very long.
The command for pre-formatted text is:
<PRE>
Text goes here
</PRE>

Block quotes. A block quote command causes text to be indented from both the left and the right. The following paragraph is a block quote:

This is a block quote. It is indented from both sides, just like you would see in a book. Block quotes are generally used when including a long piece of text, which you are quoting from another document.
The command for a block quote is:
<BLOCKQUOTE>
Text goes here
</BLOCKQUOTE>

Links

Links are what distinguish HTML documents from plain text documents. A link is a reference to other information on the world-wide web. There are many different types of links, but the basic syntax is the same for all of them:

<A HREF=reference>
Text to be highlighted goes here
</A>
The contents of reference depend on the type of link, as described below. The "text to be highlighted" is the text displayed to the reader, on which the reader can click to follow the link. For example, here is a reference to my home page. This reference was produced using the command
<A HREF="http://www.ai-lab.fh-furtwangen.de:80/~bradley/">my home page</A>

The words "my home page" are highlighted on your screen, because these are the words included inside the "A" command. By the way, "A" stands for "anchor", where an "anchor" is what a link refers to.

Here are three of the most important types of links:

Links to other documents. Links to other documents on the web must give the full address of the document. A full web address normally begins with "http://", followed by the name of the server (which often, but not always, starts with "www").

Links to local documents. If you have created several web documents, all of which reside in the same directory, they can reference one another just by using their respective file-names. For example, this document resides in the same directory as my home page, which is in a file called "index.html". Here is another reference to my home page, this time created with the command

<A HREF="index.html">my home page</A>

Links to places within the same document. If you have created a very long document, it is common to place a table of contents at the front. This table of contents contains links to specific places within the document, so that the reader can quickly go to the part he or she is interested in.

A link of this form also requires you to explicitly place an anchor in your document, identifying where the link goes. Here is an example of a link/anchor pair:

The link: <A HREF="#anchor name">highlighted text</A>
The anchor: <A NAME="anchor name">Text to link to</A>

Most commonly, the link points to a section title somewhere in the document, for example:

<A NAME="anchor name">
<H2>Section Title</H2>
</A>
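As a sketch of how this fits together, here is a short table of contents whose entries link to two section titles further down the same file. The document title, section names, and anchor names ("intro", "details") are all invented for the example. Short anchor names without spaces are used to keep the references simple. The "UL" and "LI" list commands are described in the section on lists. The anchor is placed inside the heading here, the reverse of the nesting shown above; both forms are commonly seen.

```html
<H1>My Long Document</H1>
<UL>
<LI><A HREF="#intro">Introduction</A>
<LI><A HREF="#details">Details</A>
</UL>

<H2><A NAME="intro">Introduction</A></H2>
<P>Text of the introduction.</P>

<H2><A NAME="details">Details</A></H2>
<P>Text of the details section.</P>
```

The name after the "#" in each link must match the NAME of the anchor it points to.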

Including Graphics

You can include graphics in your document with the "IMG" command. However, many people have relatively slow links to the web, and have their browsers set so that graphics are not displayed. They can click on particular graphics if they really want to see them. This means that you should include information about the pictures in your document, so that the reader can decide whether or not to look at them. And--very important--your document should not depend on the user seeing the graphics.

The format of the IMG command is:

<IMG SRC="image-file" ALT="image-description">

You replace image-file with the name of the file containing the image, and image-description with a few words telling what the picture is. If someone reading your page does not automatically download images, then this description is what they will see in place of the picture.

The most common image type on the web is a GIF file. GIF files are stored in a compressed format, which means that they can be transferred fairly quickly across the web. Note that GIF files must be stored in a file whose name ends with ".gif".

The "IMG" command takes a number of options which specify the relationship of the image to the surrounding text. The most frequently used option is "ALIGN", which is used for two purposes. First, it specifies whether the adjacent text is aligned with the top, middle, or bottom of the image (the values "top", "middle", and "bottom"). Second, in browsers that support it, it specifies whether the image itself is placed at the left or right side of the reader's window (the values "left" and "right"); these two values are a browser extension and not part of HTML 2.0. Here are a few examples:

[Image "FHF"] No alignment option specified.

[Image "FHF"] Option "align=top" specified.

[Image "FHF"] Option "align=right" specified.

Look at the source of this document to see how the options are specified.
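For readers who cannot view the source, the three examples might be written roughly as follows. The file name "fhf.gif" is an assumption made for this sketch; the description "FHF" is the one that appears in the examples.

```html
<P>
<IMG SRC="fhf.gif" ALT="FHF"> No alignment option specified.
</P>
<P>
<IMG SRC="fhf.gif" ALT="FHF" ALIGN=TOP> Option "align=top" specified.
</P>
<P>
<IMG SRC="fhf.gif" ALT="FHF" ALIGN=RIGHT> Option "align=right" specified.
</P>
```

Each line still makes sense if only the ALT text is shown, so the page does not depend on the reader seeing the image.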


Lists

The two most common types of lists are the numbered list and the unnumbered list.

Ordered lists. An ordered list produces elements preceded by numbers. The list itself is delimited with the "OL" command ("OL" stands for "ordered list"). Individual list items are preceded with the "LI" command; as with the paragraph command "P", list items need not be followed by a closing "</LI>", since it is perfectly obvious when a list element ends.

Here is an example of a numbered list, and the commands that produced it:
  1. The first item.
  2. The second item.
  3. The third item.
<ol>
<li>The first item.
<li>The second item.
<li>The third item.
</ol>

Unordered lists. Elements in unordered lists are preceded by bullets. What bullet symbol is used depends on the nesting of the list (see the following section for examples of nested lists).

An unordered list is delimited by the "UL" command. List items are preceded by the "LI" command, just as with ordered lists. Here is an example of an unordered list, and the commands that produced it:
  • The first item.
  • The second item.
  • The third item.
<ul>
<li>The first item.
<li>The second item.
<li>The third item.
</ul>

Nested lists. Both types of lists can be nested, either within a list of the same type or within a list of the other type. Here are four examples:

Unordered lists inside an unordered list.
  • This is item 1
    • Item 1a
    • Item 1b
  • This is item 2
    • Item 2a
    • Item 2b
Ordered lists inside an unordered list.
  • This is item 1
    1. Item 1a
    2. Item 1b
  • This is item 2
    1. Item 2a
    2. Item 2b
Unordered lists inside an ordered list.
  1. This is item 1
    • Item 1a
    • Item 1b
  2. This is item 2
    • Item 2a
    • Item 2b
Ordered lists inside an ordered list.
  1. This is item 1
    1. Item 1a
    2. Item 1b
  2. This is item 2
    1. Item 2a
    2. Item 2b
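To nest a list, you place the complete inner list inside a list item of the outer list. As a sketch, the second example above (ordered lists inside an unordered list) could be written as shown below. The indentation is only there for the human reader, since browsers ignore it.

```html
<ul>
<li>This is item 1
  <ol>
  <li>Item 1a
  <li>Item 1b
  </ol>
<li>This is item 2
  <ol>
  <li>Item 2a
  <li>Item 2b
  </ol>
</ul>
```

Swapping "ul" and "ol" in the appropriate places gives the other three combinations.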

Emphasizing Text

There are two main ways of emphasizing text: bold and italics.

Bold text. Bold text is delimited by the "strong" command, as in
Here is bold text
Here is <strong>bold</strong> text

Italicized text. Normally, to emphasize the importance of a few words, one places them in italics. If one is already in italics, one switches to non-italicized text for emphasis. This switch between italics and normal text is handled by the "EM" command, which stands for "emphasize". For example:
Here is emphasized text
Here is <em>emphasized</em> text

All web browsers can italicize text for emphasis. However, many do not realize that they should switch to the non-italicized font if the current font is italics. The following two pieces of text should look identical, but they probably don't:
Here is emphasized text Here is emphasized text


Special Commands

There are a number of miscellaneous commands. Two of the more common ones are

Break. The break command "BR" forces a line-break to appear in the text. This can be useful if you need to have one or two line-breaks in the text, but do not want to go so far as to use pre-formatted text (where you have to specify all of the line-breaks). Here is an example, along with the HTML that produced it:
Here is a line
break
Here is a line<BR>break

Horizontal rule. It is sometimes nice to draw a horizontal line across the document, to visually separate sections. Such a line is called a "horizontal rule", and is produced with the command:

<HR>

You can see examples of this command between all of the sections of this document.


Special and International Characters

There are many characters which one cannot (or, at least, should not) directly enter into an HTML document. For example, when an HTML command, like <P>, is directly displayed in this document, the < and > characters cannot be directly entered, or else the command would be executed rather than just displayed. Also problematical are international characters used in languages other than English. For example, in German, one needs characters with umlauts (e.g., ü, ä), in French one needs a variety of accented characters (e.g., ç, é), and so forth. All of these characters can be used in an HTML document by means of a special syntax:

&code;

The code identifies the desired character. Here are some of the more commonly used characters and their HTML equivalents:

 
<  &lt;        >  &gt;        &  &amp;
"  &quot;      ä  &auml;      Ä  &Auml;
á  &aacute;    Á  &Aacute;    à  &agrave;
À  &Agrave;    ç  &ccedil;    Ç  &Ccedil;
é  &eacute;    É  &Eacute;    è  &egrave;
È  &Egrave;    í  &iacute;    Í  &Iacute;
ì  &igrave;    Ì  &Igrave;    ö  &ouml;
Ö  &Ouml;      ó  &oacute;    Ó  &Oacute;
ò  &ograve;    Ò  &Ograve;    ß  &szlig;
ü  &uuml;      Ü  &Uuml;      ú  &uacute;
Ú  &Uacute;    ù  &ugrave;    Ù  &Ugrave;

Follow this link to see the complete list of characters from the HTML 2.0 standard.
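As a short sketch of these codes in use (the sentences are made up for the example), the following HTML contains a German greeting with an umlaut, a sharp s, and an accented e, followed by a sentence that displays a command and an ampersand code literally:

```html
<P>
Viele Gr&uuml;&szlig;e aus dem Caf&eacute;!
</P>
<P>
To start a paragraph, write &lt;P&gt;. To write an ampersand, use &amp;amp;.
</P>
```

A browser displays the first paragraph as "Viele Grüße aus dem Café!" and the second as "To start a paragraph, write <P>. To write an ampersand, use &amp;."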