Faculty of Modern and Medieval Languages

markup



 

Two entries from Brewer

Penny Readings. Parochial entertainments, consisting of readings, music, etc., for which one penny admission is charged.

Penny Saved (A). A penny saved is twopence gained. In French, 'Un centime épargné en vaut deux.'
Well, suppose a man asks twopence a piece for his oranges, and a haggler obtains hundred at a penny a piece, would he save 200 pence by his bargain? If so, let him go on spending, and he will soon become a millionaire. Or suppose, instead of paying £1,000 for a bad bet, I had not wagered any money at all, would this have been worth £2,000 to me?


About the Mark-up System

My markup system could be used for dictionaries or encyclopaedias as well as for the Dictionary of Phrase and Fable, because the kind of item being defined (word, phrase or fable) is recorded as the type attribute of the <headword> tag. It carries a good deal of information and is designed to be viewed in a specially written browser, so that all of the information it contains, such as that held in the meta tags, can be used. This would allow simple searches of a document by title, author or keyword, as well as more complex searches involving publication details.
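As a rough illustration of such a search, here is a minimal sketch in Python. It assumes the metadata is held as name/content pairs in meta elements, as in the full document further down this page, and that the header is well-formed XML. The helper names read_meta and matches are hypothetical and not part of the markup system itself.

```python
import xml.etree.ElementTree as ET

# Illustrative header; the field names are taken from the full document on this page.
HEADER = """<head>
<meta name="keywords" content="brewer, phrase, fable, dictionary" />
<meta name="source_title" content="Brewer's Dictionary of Phrase and Fable" />
<meta name="source_auth1_surname" content="Brewer" />
<meta name="source_revision_pub_name" content="Cassell" />
<meta name="source_revision_pub_year" content="1995" />
</head>"""


def read_meta(header_xml):
    """Collect the meta elements into a dict of name -> content (hypothetical helper)."""
    head = ET.fromstring(header_xml)
    return {m.get("name"): m.get("content") for m in head.iter("meta")}


def matches(meta, **criteria):
    """Return True if every requested field contains its query text, ignoring case."""
    return all(
        query.lower() in meta.get(field, "").lower()
        for field, query in criteria.items()
    )


meta = read_meta(HEADER)
# A simple search by title.
print(matches(meta, source_title="phrase and fable"))
# A more complex search combining author and publication details.
print(matches(meta, source_auth1_surname="brewer", source_revision_pub_year="1995"))
```

Keeping each field as a separate key is what lets a search combine several publication details at once.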

Basic Structure

The body of the text is enclosed in <maintext> tags, and each entry within it is marked with the <entry> tag. At least one entry must be present in any maintext area. An entry is essentially split into two parts: the <headword>, which is the word or phrase being defined, and the <definition>. Each entry must contain both of these tags.
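These structural rules can be checked mechanically. The following is a minimal sketch in Python, assuming the markup has been written as well-formed XML. The function name check_structure and the sample text are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative sample: the second entry deliberately lacks a <definition>.
SAMPLE = """<maintext>
  <entry>
    <headword type="phrase" lang="eng-gb">Penny Readings</headword>
    <definition lang="eng-gb">Parochial entertainments.</definition>
  </entry>
  <entry>
    <headword type="phrase" lang="eng-gb">Penny Saved</headword>
  </entry>
</maintext>"""


def check_structure(xml_text):
    """Return a list of structural problems in a <maintext> fragment (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "maintext":
        problems.append("root element is not <maintext>")
    entries = root.findall("entry")
    if not entries:
        problems.append("<maintext> contains no <entry>")
    for number, entry in enumerate(entries, start=1):
        # Every entry needs both a headword and a definition.
        for required in ("headword", "definition"):
            if entry.find(required) is None:
                problems.append(f"entry {number} has no <{required}>")
    return problems


print(check_structure(SAMPLE))
```

A validating parser working from the DTD would enforce the same rules; the sketch only makes them explicit.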

Language Attribute

Both the headword and the definition must have a language attribute (lang), which gives the language as an ISO 639-2 code, followed by a hyphen and then an ISO 3166-1 code for the region, for example eng-gb. This makes very accurate searches by language possible. ISO 639-2 has been used instead of the generally favoured ISO 639-1 because it provides codes for historical forms such as Middle French, which extends the indexing possibilities. Where the definition gives an equivalent word or phrase in a foreign language, it is marked with the <equiv> tag, which also carries a language attribute, again to allow easy searching by language and to open up the potential for cross-referencing.
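A search utility would need to split such a value into its two parts. The sketch below, in Python, assumes a three-letter ISO 639-2 code and a two-letter ISO 3166-1 code joined by a hyphen; it checks only the shape of the value, not whether the codes are registered. The names LanguageTag and parse_lang are hypothetical.

```python
from typing import NamedTuple


class LanguageTag(NamedTuple):
    language: str  # ISO 639-2 code, e.g. "eng", or "frm" for Middle French
    region: str    # ISO 3166-1 two-letter code, e.g. "gb"


def parse_lang(value):
    """Split a lang attribute such as 'eng-gb' into language and region."""
    language, separator, region = value.strip().lower().partition("-")
    well_formed = (
        separator == "-"
        and len(language) == 3 and language.isalpha()
        and len(region) == 2 and region.isalpha()
    )
    if not well_formed:
        raise ValueError(f"not a language-region value: {value!r}")
    return LanguageTag(language, region)


print(parse_lang("eng-gb"))
print(parse_lang("fre-fr").language)
```

Searching by language then reduces to comparing the language field, with or without the region.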

Keyword tag

To enable linguists to search for terms by keyword, a keyword is defined by placing <keyword> tags around it. This is much more accurate than using a robotic search engine to search for words in the whole corpus. Up to four alternative keywords derived from the word in question can also be defined, using the attributes alt1 to alt4. Thus, parochial may be given the alternative parish, and the search utility would therefore pick up parochial following a search for parish, or vice versa.
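One way a search utility might use these attributes is to index each keyword under its own text and under every alternative. This is a minimal sketch in Python, assuming well-formed XML input; the function build_index and the shortened sample entry are illustrative, not part of the original system.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened version of the "Penny Readings" entry, for illustration.
ENTRY = """<entry>
<headword type="phrase" lang="eng-gb">Penny Readings</headword>
<definition lang="eng-gb"><keyword alt1="parish">Parochial</keyword> <keyword>entertainment</keyword>s, for which one <keyword alt1="pence" alt2="p">penny</keyword> admission is charged.</definition>
</entry>"""


def build_index(entry_xml):
    """Map each keyword and each of its alternatives to the entry's headword."""
    entry = ET.fromstring(entry_xml)
    headword = entry.findtext("headword").strip()
    index = defaultdict(set)
    for keyword in entry.iter("keyword"):
        # The keyword's own text plus whichever of alt1..alt4 are present.
        terms = [keyword.text or ""]
        terms += [keyword.get(f"alt{n}", "") for n in range(1, 5)]
        for term in terms:
            term = term.strip().lower()
            if term:
                index[term].add(headword)
    return index


index = build_index(ENTRY)
print(sorted(index["parish"]))
print(sorted(index["parochial"]))
```

Because both terms point at the same entry, a search for either one finds it, which is the behaviour described above.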

Entity References

In order to maintain compatibility with existing systems, and eventually to allow this form of SGML to be included as an XML definition, standard HTML entity references are used for non-ASCII characters.
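The conversion between characters and entity references can be sketched in Python using the standard library's HTML entity tables. The function to_entities is a hypothetical helper; falling back to a numeric reference when no named entity exists is an assumption, not something the original specifies.

```python
from html import unescape
from html.entities import codepoint2name


def to_entities(text):
    """Replace each non-ASCII character with a named HTML entity reference."""
    pieces = []
    for character in text:
        code = ord(character)
        if code < 128:
            pieces.append(character)
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")
        else:
            # Assumed fallback: a numeric character reference.
            pieces.append(f"&#{code};")
    return "".join(pieces)


encoded = to_entities("épargné £1000")
print(encoded)
# The standard library can reverse the conversion.
print(unescape(encoded))
```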

Formatting

No formatting information has been included since it is assumed that a browser viewing the page would be set up to display the items as the user wishes. For example, if the user were to set up a stylesheet stipulating that all Headwords should be in brown, Arial font, size 12 and bold with center alignment, whilst Definitions should be in black, Times, size 10 and fully justified, this should be possible.
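As an illustration, the user's preferences could be held separately from the text and turned into style rules by the browser. The sketch below, in Python, writes them out as CSS-like rules keyed on element name. The USER_STYLES structure and the to_css function are hypothetical, and reading "size 12" as 12pt is an assumption.

```python
# Hypothetical user preferences, one set of declarations per element.
USER_STYLES = {
    "headword": {
        "color": "brown",
        "font-family": "Arial",
        "font-size": "12pt",  # assumed unit
        "font-weight": "bold",
        "text-align": "center",
    },
    "definition": {
        "color": "black",
        "font-family": "Times",
        "font-size": "10pt",  # assumed unit
        "text-align": "justify",
    },
}


def to_css(styles):
    """Render the preferences as one rule per element."""
    rules = []
    for element, declarations in styles.items():
        body = "; ".join(f"{prop}: {value}" for prop, value in declarations.items())
        rules.append(f"{element} {{ {body} }}")
    return "\n".join(rules)


print(to_css(USER_STYLES))
```

Because the rules live outside the marked-up text, a different user can supply different preferences without the document changing.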


Full SGML Document

<!-- ========== DTD ========== -->
<!ENTITY % LanguageCode "NAME">
<!ENTITY % Text "CDATA">

<!-- The header holds one or more meta elements; the body holds one or more entries. -->
<!ELEMENT head - - (meta+) >
<!ELEMENT meta - O EMPTY >
<!ELEMENT maintext - - (entry+) >
<!ELEMENT entry - - (headword+, definition+) >
<!ELEMENT headword - - (#PCDATA) >
<!-- Definitions mix text with keywords and foreign-language equivalents. -->
<!ELEMENT definition - - (#PCDATA | keyword | equiv)* >
<!ELEMENT keyword - - (#PCDATA) >
<!ELEMENT equiv - - (#PCDATA | keyword)* >
<!ATTLIST headword
type (word|phrase|fable) #REQUIRED
lang %LanguageCode; #REQUIRED
note (A|B) #IMPLIED
>
<!ATTLIST definition
lang %LanguageCode; #REQUIRED
>
<!ATTLIST equiv
lang %LanguageCode; #REQUIRED
type (word|phrase|fable) #REQUIRED
>
<!ATTLIST keyword
alt1 %Text; #IMPLIED
alt2 %Text; #IMPLIED
alt3 %Text; #IMPLIED
alt4 %Text; #IMPLIED
lang %LanguageCode; #IMPLIED
>
<!-- Each meta element is a name/content pair, as used in the header; the permitted names are listed here. -->
<!ATTLIST meta
name (keywords
| source_title
| source_auth1_surname
| source_auth1_otherinits
| source_auth1_dates
| source_1st_pub_year
| source_edition_no
| source_revision_surname
| source_revision_otherinits
| source_revision_pub_year
| source_revision_pub_name
| source_revision_pub_place
| source_copyright
| source_pages_no
| sgml_auth1_surname
| sgml_auth1_otherinits
| sgml_auth1_dates
| sgml_1st_pub_year
| sgml_edition_no) #REQUIRED
content %Text; #REQUIRED
>

<!-- ========== Header ========== -->
<head>
<meta name="keywords" content="brewer, phrase, fable, dictionary">
<meta name="source_title" content="Brewer's Dictionary of Phrase and Fable">
<meta name="source_auth1_surname" content="Brewer">
<meta name="source_auth1_otherinits" content="E. C.">
<meta name="source_auth1_dates" content="1810-1897">
<meta name="source_1st_pub_year" content="1894">
<meta name="source_edition_no" content="15">
<meta name="source_revision_surname" content="Room">
<meta name="source_revision_otherinits" content="A.">
<meta name="source_revision_pub_year" content="1995">
<meta name="source_revision_pub_name" content="Cassell">
<meta name="source_revision_pub_place" content="London">
<meta name="source_copyright" content="Cassell, 1995">
<meta name="source_pages_no" content="1182">
<meta name="sgml_auth1_surname" content="Smith">
<meta name="sgml_auth1_otherinits" content="D. N. A.">
<meta name="sgml_auth1_dates" content="1981 - ">
<meta name="sgml_1st_pub_year" content="2002">
<meta name="sgml_edition_no" content="1">
</head>

<!-- ========== Body ========== -->
<maintext>

<entry> <headword type="phrase" lang="eng-gb"> Penny Readings </headword>
<definition lang="eng-gb"> <keyword alt1="parish"> Parochial </keyword> <keyword> entertainment</keyword>s, consisting of <keyword> reading</keyword>s, <keyword> music </keyword>, etc., for which one <keyword alt1="pence" alt2="p"> penny </keyword> admission is charged. </definition>
</entry>

<entry> <headword type="phrase" lang="eng-gb" note="A"> Penny Saved </headword>
<definition lang="eng-gb"> A <keyword alt1="pence" alt2="p"> penny </keyword> saved is <keyword alt1="two+pence" alt2="2p" alt3="2d" alt4="two+pennies"> twopence </keyword> gained. In French, <equiv type="phrase" lang="fre-fr"> Un <keyword> centime </keyword> <keyword> &eacute;pargn&eacute; </keyword> en vaut deux. </equiv> Well, suppose a man asks twopence a piece for his oranges, and a haggler obtains hundred at a penny a piece, would he save 200 pence by his bargain? If so, let him go on spending, and he will soon become a millionaire. Or suppose, instead of paying &pound;1000 for a bad bet, I had not wagered any money at all, would this have been worth &pound;2000 to me? </definition> </entry>
</maintext>

<!-- ========== End ========== -->


An interpretation in XML

XML Code


<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="diction.xsl"?>
<!DOCTYPE body [
<!-- The document type name matches the root element, body. -->
<!ELEMENT body (meta+, entry+) >
<!ELEMENT meta EMPTY >
<!ELEMENT entry (headword+, definition+) >
<!ELEMENT headword (#PCDATA) >
<!-- Definitions mix text with keywords and foreign-language equivalents. -->
<!ELEMENT definition (#PCDATA | keyword | equiv)* >
<!ELEMENT keyword (#PCDATA) >
<!ELEMENT equiv (#PCDATA | keyword)* >
<!-- lang is NMTOKEN rather than ID, since ID values must be unique within a document. -->
<!ATTLIST headword
type (word|phrase|fable) #REQUIRED
lang NMTOKEN #REQUIRED
note (A | B) #IMPLIED
>
<!ATTLIST definition
lang NMTOKEN #REQUIRED
>
<!ATTLIST equiv
lang NMTOKEN #REQUIRED
type (word|phrase|fable) #REQUIRED
>
<!ATTLIST keyword
alt1 CDATA #IMPLIED
alt2 CDATA #IMPLIED
alt3 CDATA #IMPLIED
alt4 CDATA #IMPLIED
lang NMTOKEN #IMPLIED
>
<!-- Each meta element is a name/content pair; the permitted names are listed here. -->
<!ATTLIST meta
name (keywords
| source_title
| source_auth1_surname
| source_auth1_otherinits
| source_auth1_dates
| source_1st_pub_year
| source_edition_no
| source_revision_surname
| source_revision_otherinits
| source_revision_pub_year
| source_revision_pub_name
| source_revision_pub_place
| source_copyright
| source_pages_no
| sgml_auth1_surname
| sgml_auth1_otherinits
| sgml_auth1_dates
| sgml_1st_pub_year
| sgml_edition_no) #REQUIRED
content CDATA #REQUIRED
>
]>
<body>
<meta name="keywords" content="brewer, phrase, fable, dictionary" />
<meta name="source_title" content="Brewer's Dictionary of Phrase and Fable" />
<meta name="source_auth1_surname" content="Brewer" />
<meta name="source_auth1_otherinits" content="E. C." />
<meta name="source_auth1_dates" content="1810-1897" />
<meta name="source_1st_pub_year" content="1894" />
<meta name="source_edition_no" content="15" />
<meta name="source_revision_surname" content="Room" />
<meta name="source_revision_otherinits" content="A." />
<meta name="source_revision_pub_year" content="1995" />
<meta name="source_revision_pub_name" content="Cassell" />
<meta name="source_revision_pub_place" content="London" />
<meta name="source_copyright" content="Cassell, 1995" />
<meta name="source_pages_no" content="1182" />
<meta name="sgml_auth1_surname" content="Smith" />
<meta name="sgml_auth1_otherinits" content="D. N. A." />
<meta name="sgml_auth1_dates" content="1981 - " />
<meta name="sgml_1st_pub_year" content="2002" />
<meta name="sgml_edition_no" content="1" />
<entry>
<headword type="phrase" lang="eng-gb"> Penny Readings </headword>
<definition lang="eng-gb"> <keyword alt1="parish"> Parochial </keyword> <keyword> entertainment</keyword>s, consisting of <keyword> reading</keyword>s, <keyword> music </keyword>, etc., for which one <keyword alt1="pence" alt2="p"> penny </keyword> admission is charged. </definition>
</entry>
<entry>
<headword type="phrase" lang="eng-gb" note="A"> Penny Saved </headword>
<definition lang="eng-gb"> A <keyword alt1="pence" alt2="p"> penny </keyword> saved is <keyword alt1="two+pence" alt2="2p" alt3="2d" alt4="two pennies"> twopence </keyword> gained.
In French, <equiv type="phrase" lang="fre-fr"> Un <keyword> centime </keyword> <keyword> épargné </keyword> en vaut deux. </equiv>
Well, suppose a man asks twopence a piece for his oranges, and a haggler obtains hundred at a penny a piece, would he save 200 pence by his bargain?
If so, let him go on spending, and he will soon become a millionaire. Or suppose, instead of paying £1000 for a bad bet, I had not wagered any money at all, would this have been worth £2000 to me? </definition>
</entry>
</body>

XSL Code

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<head>
<title>Dictionary of Phrase and Fable</title>
<style>
h1 {font-family:Verdana, Arial, sans-serif; font-size:16pt; color:#000000}
h2 {font-family:Verdana, Arial, sans-serif; font-size:16pt; color:#FF0000}
p {font-family:Verdana, Arial, sans-serif; font-size:12pt; color:#550055}
</style>
</head>
<body>
<h1>Dictionary of Phrase &amp; Fable</h1>
<xsl:for-each select="body/entry">
<h2>
<xsl:value-of select="headword"/></h2>
<p><xsl:value-of select="definition"/></p>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Link to XML version

Click here to view the XML version produced with the above code. Note that this will only work in Internet Explorer 6, because XSL support in Netscape 6 and in Internet Explorer 5 and 5.5 was based upon a draft recommendation by the W3C which has since been changed. See here for more details. Because the stylesheet above outputs each definition with xsl:value-of, which returns only its text content, the quotation is not formatted differently from the rest of the definition.
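Outside XSL, the quotation can be given its own formatting by walking the child elements of the definition instead of flattening it to text. The following is a minimal sketch in Python; the render function, the shortened sample definition and the choice of italics are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Shortened definition, for illustration.
DEFINITION = (
    '<definition lang="eng-gb">A <keyword>penny</keyword> saved. In French, '
    '<equiv type="phrase" lang="fre-fr">Un <keyword>centime</keyword> '
    'épargné en vaut deux.</equiv> Well, suppose.</definition>'
)


def render(node):
    """Convert an element to HTML text, wrapping the content of <equiv> in <i>."""
    parts = [escape(node.text or "")]
    for child in node:
        inner = render(child)
        # Only the foreign-language equivalent gets special formatting.
        parts.append(f"<i>{inner}</i>" if child.tag == "equiv" else inner)
        # Text following the child belongs to the parent.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


print(render(ET.fromstring(DEFINITION)))
```

The recursion visits each child in document order, so the keywords inside the quotation are kept while the quotation as a whole is marked up separately.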

 


Last updated: 17 April 2002
© Dominic Smith
Email: dom@domsmith.co.uk