Defining and identifying the roles of geographic references within text: Examples from the Great Britain Historical GIS project
Abstract
Reliably recognizing, disambiguating, normalizing, storing, and displaying geographic names poses many challenges. However, associating each name with a geographical point location cannot be the final stage. We also need to understand each name’s role within the document, and its association with adjacent text. The paper develops these points through a discussion of two different types of historical texts, both rich in geographic names: descriptive gazetteer entries and travellers’ narratives. It concludes by discussing the limitations of existing mark-up systems in this area.

1 The Great Britain Historical GIS

The Great Britain Historical GIS is a very large assembly of historical information about Britain, all in some sense tied to particular places. The earliest data in the system was computerised in the late 1970s, and it was established as a relational database in 1989-91. Until recently, however, almost our entire content was either statistical or locational: by now, we have computerised or acquired from collaborators a substantial fraction of the information published in the reports of the Censuses of Population for England and Wales, and for Scotland; and of the information published in vital registration reports for the same areas since the 1840s. In general, our coverage ends in the early 1970s, when the relevant information began to be published in digital form. Our statistical database by now comprises over 33 million data values, and is closely linked to digital mapping containing the changing boundaries of the various statistical reporting units: counties, various types of district, and approaching 20,000 parishes. This material has formed the basis for studies of demographic, economic and social change.

However, our largest source of current funding has a different focus. A grant from part of the UK National Lottery is turning the GBH GIS into an on-line resource for ‘life-long learners’, which in practice means the general public.
Our system is not a conventional on-line GIS, as our most obvious audience is people interested in local history. The most basic functionality of our site allows users to specify a location by a place-name or, preferably, a postal code, which in the UK identifies a group of maybe ten houses and therefore a fairly precise location. They will then be taken to a page providing information on how their current local authority (there are 408 in Great Britain) has changed over the last 200 years, with the option to drill deeper by accessing information for the various past administrative units which contained the location they specified. An initial system, limited to the above functions for the modern units, should go live during May 2003: www.VisionOfBritain.org.uk

Later versions of the site will contain vastly greater content. We will provide access to data for the original historical units, using ‘point-in-polygon’ searching of a spatial database to identify relevant units, having first converted the user’s postal code into a geographic coordinate. The site will draw on both vector mapping of historic boundaries and geo-referenced image scans of two complete editions of Ordnance Survey One Inch-to-the-Mile maps of Great Britain: the New Popular Edition, published in the late 1940s and the first to include National Grid lines, which simplifies the geo-referencing of all the scanned maps; and the nineteenth century First Series. Geo-referencing the latter will be challenging, and will require extensive ‘rubber sheeting’.

We are computerising two existing inventories of historical administrative units covering England and Wales, and constructing an equivalent digital resource covering Scotland.
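Returning to the ‘point-in-polygon’ search described above, the idea can be illustrated with a minimal Python sketch. This is not the project's implementation, which runs inside a spatial database; the `Unit` record, its field names and the optional `year` filter are assumptions made purely for illustration, and the test used is the standard ray-casting method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (easting, northing) in any planar grid


@dataclass
class Unit:
    """Hypothetical record for one administrative unit."""
    name: str
    boundary: List[Point]      # polygon vertices, in order
    created: int               # first year the unit existed
    abolished: Optional[int]   # None if the unit still exists


def point_in_polygon(pt: Point, poly: List[Point]) -> bool:
    """Ray casting: count how many edges a ray running east from pt crosses."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def units_containing(pt: Point, units: List[Unit],
                     year: Optional[int] = None) -> List[str]:
    """Names of units whose boundary contains pt, optionally only those
    in existence in the given year."""
    hits = []
    for u in units:
        if year is not None:
            if year < u.created:
                continue
            if u.abolished is not None and year >= u.abolished:
                continue
        if point_in_polygon(pt, u.boundary):
            hits.append(u.name)
    return hits
```

In use, the coordinate derived from a postal code would be passed as `pt`, and the returned names would identify the units whose pages the user can then visit.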
These inventories of administrative units are not gazetteers but systematic lists of all the counties, parishes and various kinds of districts that ever existed, with their hierarchic relationships and some information on variant names. This information provides the absolute core of our system, structured as an ontology rather than as a strictly hierarchic thesaurus. The core ontology does not require locational information for units but, where available, locations are stored as polygons representing the boundaries, with dates of creation and abolition.

In our final system, we will be able to offer ‘home pages’ not just for the 408 modern districts but for over 20,000 historic units, including the parishes which were the lowest level of administration until recently, and generally correspond with individual villages. Each home page will contain a map showing the overall location within Britain and a short description generated from the database and highlighting key statistics. From the home page, users will be able to access a more local map showing the unit’s boundary, a range of statistics mostly presented graphically, and information on the unit’s history including boundary changes and hierarchic relationships.

Relative to the overall size and scope of the site, our own capacity to author descriptive and explanatory text is limited. We will concentrate on the text to accompany maps showing national patterns. However, a site that was largely limited to statistics, even presented as maps and graphs, would be pretty boring and we are therefore computerising a large quantity of text from existing publications. This text forms the main subject of this paper. There are in fact three types of text. Firstly, we are computerising the introductions to all the census reports between 1801 and 1851, to provide a description of the country as a whole. This aspect of the project is not further discussed here.
Secondly, we are computerising three descriptive gazetteers published in the late nineteenth century, totalling over 4,000 pages and containing about 5 million words:
• John Bartholomew’s Gazetteer of the British Isles (Edinburgh, 1887). This covers the whole British Isles, including Ireland.
• John Marius Wilson’s Imperial Gazetteer of England and Wales (Edinburgh, 6 vols., 1870-72).
• Francis Groome’s Ordnance Gazetteer of Scotland (Edinburgh, 6 vols., 1882-85). Our work on this is a collaboration with the Gazetteer for Scotland project.

Even with the gazetteer text, it will be very easy for users to locate information about very specific places, but much harder for them to move around the system to explore the relationships between places, and form a ‘vision of Britain through time’ as a whole. This justifies our third and final type of new content: narratives describing historical journeys around Britain. We are computerising four well known accounts, as well as some shorter accounts written by radical agitators as they moved around in the mid-19th century:
• William Cobbett, Rural Rides (London, 1830).
• Daniel Defoe, A Tour thro’ the Whole Island of Great Britain, divided into circuits or journies (London, 1724-7).
• Celia Fiennes, Through England on a side saddle in the time of William and Mary, being the diary of Celia Fiennes (London, 1888).
• Arthur Young, Tours in England and Wales, Selected from the Annals of agriculture (London, 1784-98).

This list has been deliberately kept short, as we almost certainly have the capacity to digitise more books via our Optical Character Recognition system but not necessarily to mark them up. Three obvious additions would be the journals of John Wesley, the Torrington Diaries and Boswell and Johnson’s tour of the Hebrides.

2 Descriptive Gazetteers

The descriptive gazetteers form a very large body of text, but fortunately they are highly structured, making automated parsing feasible.
The parsing software runs within our Oracle database and is written in SQL and PL/SQL. I have no doubt that it would be both more efficient and more effective if it were written in, say, Perl. Little more will be said about the software, other than to note that its relative effectiveness is mainly evidence of the vital importance of having a large database of place-names already built.

Although we are working with three different books, all are written to a broadly similar formula:
• Each consists of alphabetically arranged entries; each entry begins with a head-word, i.e. the place-name, usually in bold or upper case letters.
• The head-word is followed by an indication of the feature type (‘a parish’, ‘a river’, etc).
• Third comes some indication of where the feature is, which almost always indicates a county, sometimes a relative location (‘9 miles SW of Worcester’) and never an absolute location such as latitude and longitude.

The main difference between the books is that Bartholomew’s consists of a very large number of short entries while the Imperial Gazetteer and Groome’s provide longer entries, those for major cities and counties covering several pages. Mostly, however, we focus on the first sentence as outlined above. Here are some samples, firstly from the very beginning of Bartholomew’s:
• A'an, or Avon, lake, S. Banffshire, among the Cairngorm mountains, 1_ mile long, at alt. of 2250 ft.; it is the head-water of river Avon: which see.
• Aasleagh, place, co. Mayo, 16 m. S. of Westport; P.O.
• Abbas and Temple Combe, par., mid. Somerset, 4 miles S. of Wincanton sta., 1850 ac., pop. 590.
• Abbenhall. See ABENHALL.
• Abberley, par. and seat, W. Worcestershire, 4 miles SW. of Stourport sta., 2636 ac., pop. 605; P.O.
• Abbert, seat, 10 miles NE. of Athenry, co. Galway.
• Abbertoft, hamlet, Willoughby par., mid. Lincolnshire, 2 miles SE. of Alford.
• Abberton.—par., E. Essex, on Roman road, 4 miles S. of Colchester, 1068 ac., pop. 244; P.O.—2.
Abberton, par. and seat, E. Worcestershire, on river Piddle, 4miles NE. of Pershore sta., 1001 ac., pop. 92. • Abberwick, township, Edlingham par., N. Northumberland, on river Alne, 3 miles W. of Alnwick, 1680 ac., pop. 109. • Abbess Roding. See ABBOTS ROOTHING. • Abbethune, seat, 1 m. from Inverkeilor sta., Forfarsh. Secondly, from the Imperial Gazetteer: • AFTON , a village 2 miles S of Yarmouth, Isle of Wight. Afton House adjoins it, on a pleasant slope toward the Yar. Afton Down rises in the south-eastern neighbourhood, overhangs the English Channel, has an altitude of about 500 feet, and is crowned by tumuli. • BINSTEAD, a small village and a parish in the Isle of Wight. The village stands on the coast of the Solent, amid charming environs, 1_ mile W by N of Ryde. The parish comprises 1,140 acres of land and 335 of water; and its post-town is Ryde. Real property, £2,775. Pop., 486. Houses, 105. The manor belonged, at the Conquest, to William Fitz-Stur; and passed to the Bishops of Winchester. Several picturesque villas, one of them belonging to Lord Downes, stand near the village and on the coast. Quarr Abbey House is the seat of Admiral Sir Thomas J. Cochrane. Remains of a Cistertian Abbey, called Quarr Abbey, founded in 1132, by Baldwin de Redvers, afterwards Earl of Devon, stand at a farmstead, 5 furlongs west of the village; and, though fragmentary and mutilated, show some interesting features. A siliceous limestone, containing many fossils, and well suited for building, has been extensively quarried since at least the time of William Rufus. The living is a rectory in the diocese of Winchester. Value, £80.* Patron, the Bishop of Winchester. The church was rebuilt in 1842; is in the early English style; and embodies some sculptured stones of a previous Norman edifice. • BRAMBLE CHINE, a small ravine on the NW coast of the Isle of Wight; at Colwell bay, 2 miles SW of Yarmouth. 
A thick bed of oyster shells, in a fossil state, is here; the shells in the same position as in life, but entirely decomposed. • CALBOURNE, a village, a parish, and a sub-district in the Isle of Wight. The village stands 5 miles WSW of Newport; and-has a post-office under Newport. The parish includes also Newtown borough; and extends from Brixton Down to the Solent. Acres, 6,397; of which 265 are water. Real property, £4,471. Pop., 728. Houses, 145. The property is divided among a few. Westover manor belonged to the Esturs; passed to the Lisles and the Holmeses; and belongs now to the eldest son of Lord Heytesbury, in right of his wife, the daughter of the late Sir Leonard W. Holmes. The house on it is modern; and the grounds are tasteful. Calbourne Bottom, 1_ mile SSW of the village, is a depression between Brixton and Moltestone downs. The living is a rectory, united with the p. curacy of Newtown, in the diocese of Winchester. Value, £675.* Patron, the Bishop of Winchester. The church is early English, much modernized; and has a brass of 1480.—The subdistrict contains eight parishes. Acres, 25,050. Pop., 5,417. Houses, 1,071. • WIGHT (Isle of), an island in Hants; bounded, on the N, by the Solent,–on the other sides, by the English channel. Its outline is irregularly rhomboidal, and has been compared to that of a turbot, and to that of a bird with expanded wings. Its length from E to W, from Bembridge Point to the Needles, is nearly 23 miles; its greatest breadth from N to S, from West Cowes to St. Catherine’s Point, is 13_ miles; its circuit is about 56 miles; and its area, inclusive of foreshore, is 99,746 acres. The general surface has a considerable elevation above sea-level. 
The coast, along the N, is low; around the W angle, is rocky, broken, precipitous, and romantic; and along the SW, the S, and the SE, breaks down in a richly varied series of cliffs, often abrupt or mural, extensively terraced and lofty, including all the magnificent range known as the Undercliff, and everywhere replete with scenic interest. The water-shed uniformly follows the trending of the S coast; and is distant from it never more than 2_ miles, generally less than 1 mile. A range of downs extends about 6 miles from St. Catherine’s Hill to Dunnose; rises from the shore, with excessive steepness, to a height of nearly 800 feet; and is marked, along its steep sea-front, with the picturesque terraces of the Undercliff. A diversified range of downs extends about 22 miles, from the Needles on the W to Culver cliff on the E; commences in grand cliffs about 600 feet high; runs 9 miles nearly due east, in a single, sharp, steep ridge, to Mottiston; attains there its highest altitude, at 662 feet above sea-level; makes several debouches in its subsequent progress; suffers repeated cleaving and disseverment, in the form of gaps or depressions; assumes, for some distance, in the neighbourhood of Carisbrooke, the character of a double or a triple range; is, in some parts of its course, saddle-shaped and slender,–in other parts, broad-based and moundish; and divides the island into two pretty nearly equal sections. A transverse ridge, about 400 feet high, extends about 3 miles in the contiguous to the river Yar; and another transverse ridge, tame in feature, but sometimes of considerable height, extends between the Medina and the Brading. The rest of the surface is either undulating or gently sloping, and has little or no claim to be called picturesque. The chief streams are the Yar, the Newton, the Medina, the Wooton, and the Main or Brading. 
The geognostic structure comprises chiefly lower greensand in most of the S, chalk in part of the centre, and upper eocene in most of the N; but includes many details, possesses deep interest, and may advantageously be studied with the aid of Mantell’s and Martin’s manuals. [just the first paragraph of a long entry]

The second example from the Imperial Gazetteer demonstrates the main reason we need to do a limited amount of work on the whole entry, not just the first sentence. For Binstead, the feature type clause is ‘a small village and a parish’, meaning that the place-name is associated with more than one entity. In extreme cases, a single entry covers four distinct entities: for example, Ledbury in Herefordshire was described as being ‘a small town, a parish, a sub-district and a district’. The last three of these terms are all distinct entities within our ontology, and the entries for such places are in fact divided into a series of sections, each beginning with the type of sub-entry. For Binstead, the first part begins ‘The village’ and is just a single sentence; the second part begins ‘The parish’.

The texts are first scanned by a specialised Optical Character Recognition system optimised for historic materials, operated by our team based at the Centre for Data Digitisation and Analysis at the Queen’s University Belfast. The OCR output is then visually checked and tidied up by Information Technology trainees there, and delivered to the project’s main team as Microsoft Word files replicating the source documents as closely as possible. The first stage in our work is breaking the text down into the individual entries, each of which is then loaded as a separate record into our database. The way the text so clearly divides into such discrete sections greatly simplifies how we handle it.
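The multi-entity structure of Imperial Gazetteer entries described above (a feature type clause such as ‘a small village and a parish’, followed by sections beginning ‘The village’ and ‘The parish’) can be illustrated with a short Python sketch. It is an illustration only, not the project's PL/SQL code: the function names, the rule for where the type clause ends, and the abbreviated Binstead text in the usage example are all assumptions.

```python
import re
from typing import Dict, List

# Assumed rule: the type clause ends at the first locational preposition,
# the first number, or a semicolon.
_STOP = re.compile(r"\s(?:in|on|at|near)\s|\s\d|;")
_SPLIT = re.compile(r",|\band\b")


def feature_types(first_sentence: str) -> List[str]:
    """Return the entity types named after the head-word,
    e.g. ['village', 'parish'] for 'a small village and a parish'."""
    _, _, rest = first_sentence.partition(",")
    stop = _STOP.search(rest)
    clause = rest[:stop.start()] if stop else rest
    types = []
    for piece in _SPLIT.split(clause):
        words = piece.strip(" .").split()
        # Each type phrase starts with an article; its last word is the noun.
        if words and words[0].lower() in ("a", "an"):
            types.append(words[-1].lower())
    return types


def split_sections(entry: str, types: List[str]) -> Dict[str, str]:
    """Cut an entry into sub-entries, each starting 'The <type> '."""
    starts = []
    for t in types:
        pos = entry.find(f"The {t} ")
        if pos >= 0:
            starts.append((pos, t))
    starts.sort()
    sections = {}
    for i, (pos, t) in enumerate(starts):
        end = starts[i + 1][0] if i + 1 < len(starts) else len(entry)
        sections[t] = entry[pos:end].strip()
    return sections
```

For example, with an abbreviated version of the Binstead entry:

```python
entry = ("BINSTEAD, a small village and a parish in the Isle of Wight. "
         "The village stands on the coast of the Solent. "
         "The parish comprises 1,140 acres of land and 335 of water.")
types = feature_types(entry.split(". ")[0])
sections = split_sections(entry, types)
```

A real implementation would also need to cope with spelling variation between the type clause and the section openings, such as ‘sub-district’ against ‘subdistrict’ in the Calbourne entry.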
Because the text divides so cleanly into discrete entries, for now we are not applying any mark-up system to the text itself, other than basic HTML tags to preserve basic formatting, such as bold and italics. Instead, additional structure and search facilities are provided by adding additional columns to the table. What follows describes the parsing process for entries from Bartholomew’s:
• Firstly, the end of the head word is identified simply from its being in bold face, and the head word is copied into another column. NB with the gazetteers, identifying the most important geographical name within the text is fairly trivial.
• Entries which are cross-references are identified from their containing specific phrases immediately after the head word: ‘See’, ‘also called XXX, which see’, ‘another name of XXX, which see’ and, at present, ‘Welsh name of XXX, which see’. The system then searches for the cross-referenced name elsewhere in the table.
• The system then tries to identify the feature type by brute force methods: all the strings immediately following the head word in the first two thousand entries were extracted and sorted, and the section identifying the feature type isolated to give 410 distinct type strings. Each of these was then marked up firstly with a version of itself in which all abbreviations were expanded, and secondly with three flags indicating whether the type indicated the entry was for a county, a parish or a borough. Longer term, these ‘original’ feature types will be mapped onto the Alexandria Digital Library Gazetteer Feature Type thesaurus.
• The next major stage is the identification of the county, which almost every entry includes. Our core ontology contains a complete list of all counties in the British Isles, together with some variant names. We know that one of these names should appear in almost every entry, so brute force methods are used to find them, starting with the first clause of the first sentence, then looking at the second clause and so on up to the eighth.
• A similar method is used to find parish names, looking for various text strings which indicate a reference to a parish, such as ‘and forming part of XXX’ or ‘the hamlet is in XXX’. NB as we have already identified the county, the system matches only parish names in the correct county.

What current procedures do not do is systematically associate each entry with a point location. This may not be necessary. Almost every entry is associated with a county defined within our ontology, and the county will be associated with a polygon. Further, a great many entries are also associated with parishes, or actually are entries for a parish: each parish covers a few square miles, which is sufficiently precise for many purposes. However, we are examining methods for linking each entry with a single grid reference. Three approaches are possible:
• In principle, it would be possible to parse the data on relative locations within the entries, such as ‘16 m. S. of Westport’ or ‘2 miles SW of Yarmouth’.
• In practice, we will first attempt to add locations by running the entries against a modern gazetteer. One possibility is to use the Geo-X-walk gazetteer constructed by collaborators at EDINA and query it via the ADL gazetteer service protocol, using our county polygons to specify the area to be searched.
• As always, neither method is likely to be totally effective, and a small number of entries could be located manually.

So far, we are not trying to systematically extract other information from the entries. Although parish and district entries contain certain items of statistical data in a largely predictable way, this is almost all taken from census reports which we have separately computerised. For example, the district entries always include data on numbers attending each type of church, but this comes from the report of the 1851 census of religion, which we also hold.
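The parsing steps for Bartholomew’s entries described above (head word, cross-reference, county search clause by clause, and relative location) can be sketched in Python roughly as follows. This is a hedged illustration rather than the project's PL/SQL code. It assumes that bold face is stored as `<b>…</b>` tags, that clauses are separated by commas and semicolons, and that the county list is a small dictionary of variant spellings; `COUNTY_VARIANTS` and all function names are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical extract from the county list: variant spelling -> canonical name.
COUNTY_VARIANTS: Dict[str, str] = {
    "Worcestershire": "Worcestershire",
    "Lincolnshire": "Lincolnshire",
    "Forfarsh.": "Forfarshire",
    "co. Galway": "Galway",
    "co. Mayo": "Mayo",
}

# Cross-reference phrases listed in the text, anchored just after the head word.
CROSS_REF = re.compile(
    r"^\s*[.,]?\s*(?:See\s+(?P<a>[^,.;]+)"
    r"|(?:also called|another name of|Welsh name of)\s+(?P<b>[^,.;]+),\s*which see)"
)

# Relative locations such as '16 m. S. of Westport' or '4 miles SW. of Stourport sta.'
RELATIVE = re.compile(
    r"(\d+)\s*(?:miles?|m\.)\s*([NSEW]{1,3})\.?\s+of\s+"
    r"([A-Z][\w' -]*?)(?:\s+sta\.)?(?=[,;.]|$)"
)


def split_headword(entry: str) -> Tuple[str, str]:
    """The head word is whatever is in bold; the rest of the entry follows it."""
    m = re.match(r"\s*<b>(.*?)</b>(.*)", entry, re.S)
    if not m:
        raise ValueError("no bold head word found")
    return m.group(1).strip(" .,"), m.group(2)


def find_county(rest: str, max_clauses: int = 8) -> Optional[str]:
    """Search the first eight clauses in order for any known county name."""
    clauses = re.split(r"[,;]", rest)[:max_clauses]
    # Longest variants first, so a short name cannot shadow a longer one.
    variants = sorted(COUNTY_VARIANTS, key=len, reverse=True)
    for clause in clauses:
        for v in variants:
            if re.search(r"(?<!\w)" + re.escape(v) + r"(?!\w)", clause):
                return COUNTY_VARIANTS[v]
    return None


def parse_entry(entry: str) -> dict:
    head, rest = split_headword(entry)
    xref = CROSS_REF.match(rest)
    if xref:
        target = xref.group("a") or xref.group("b")
        return {"head": head, "see": target.strip()}
    rel = RELATIVE.search(rest)
    return {
        "head": head,
        "county": find_county(rest),
        "relative": (int(rel.group(1)), rel.group(2), rel.group(3)) if rel else None,
    }
```

Applied to the Abberley sample, with the head word wrapped in the assumed bold tags, `parse_entry` would yield the county Worcestershire and the relative location 4 miles SW of Stourport. Turning that relative location into a grid reference would additionally require coordinates for Stourport, which is where the existing place-name database comes in.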
We have built a demonstration system containing a variety of information for the Isle of Wight, an island and small county just off the south coast of England. This includes all relevant entries from the Imperial Gazetteer, and the system also includes the first 1,720 entries from Bartholomew’s. Of these, 54 are cross-references to other entries, and of the remaining 1,666 the parsing software has associated 1,609 (97%) with counties. This system can be accessed on-line at: http://www.gbhgis.org/demo_gaz.htm

Although this prototype currently lacks mapping capabilities, it does show the high level of integration we have been able to achieve between the descriptive gazetteers and other content.

Summing up this discussion of our work with descriptive gazetteers, the formulaic nature of the original texts is of great assistance in automated parsing. For any given type of entry, we know what kinds of geographical names to expect:
• Subject: the name of the place itself. Easily identified as it is the head word.
• Containers: higher level units that contain the subject. We always look for a county, and sometimes look for a parish. In either case, our existing database provides an authoritative list of valid units.
• Relatives: these are the generally larger settlements that are mentioned in relative locations. Identifying them from the surrounding phrasing should be straightforward, and in identifying ‘larger’ places it may prove helpful that our database also contains a mass of population statistics.

3 Travellers’ Accounts

The travellers’ accounts are much less formulaic than the descriptive gazetteers and constitute a smaller body of text. While work on the gazetteers is well advanced, methodologies for the travel narratives are still being explored and our expectation is that we will use semi-manual methods, identifying place-names and other elements by reading through the text but using software to assist in adding mark-up.
The paper’s concern is not with this process, but with what features we should be identifying and how best to mark them up.

Celia Fiennes’ travels in the 1690s are perhaps the purest example of journeys both undertaken and described for purely personal reasons. They were ‘begun to regain my health by variety and change of aire and exercise’ (p. ix), and her account was not published for nearly two hundred years. Here is her account of visiting Stonehenge:

Thence 6 miles to Blandford, thence 18 to Salsebury and 8 mile to Newtontony which stands in ye midst of ye downs 8 mile from Andover a market town in Hampshire and ye roade to London. It lyes 15 mile from Winchester–it is three mile from Amesbury and 2 mile more to Stoneage that stands on Salsebury plaine–eminent for many battles being faught there–this Stoneage is reckon'd one of the wonders of England how such prodigeous stone should be brought there, as no such Stone is seen in ye Country nearer than 20 mile. They are placed on the side of a hill in a rude jregullar form–two stones stands up and one laid on their tops with morteses into each other and thus are severall in a round like a wall with spaces between, but some are fallen down, so spoyle the order or breach in the temple, as some think it was in the heathen tymes; others thinke it the Trophy of some victory wone by one Ambrosious, and thence the town by it has its name of Amsebury. There is severall rows of lesser stones within the others set up in the same forme of 2 upright and one lies on the top like a gateway. How they were brought thither or whether they are a made stone is not resolved–they are very hard yet I have seen some of them scraped–the weather seemes not to penetrate them. To increase the wonder of the story is that none Can Count them twice alike–they stand confused and some single stones at a distance but I have told them often, and bring their number to 91.
This Country is most Champion and open, pleasant for recreations–its husbandry is mostly Corn and sheep, the Downs though short grass ye feed is sweet, producing the finest wooll and sweet meat though but small. (pp. 9-10)

We are also including narratives written by radical agitators. These reflect a personal interest which began with tramping artisans and moved on to working class autobiographies written by men who used the tramping system (see Southall 1991a, 1991b, 1996). Autobiographies generally described mobility in early adulthood, artisans being encouraged to travel to widen their experience after they completed their apprenticeships but before they married. Narratives written long after the events described tend to describe journeys very vaguely: ‘for sixteen months I tramped through the principal towns of Middlesex, Lancashire, and Yorkshire’ or, unhelpfully, ‘I need not follow my wanderings for some years, as my life at that time was of the ordinary kind.’ In contrast, the texts described here as ‘agitators’ narratives’ were written immediately after the events described, and themselves formed part of the agitation: they typically appeared in the relevant movement’s weekly or monthly newspapers. For now, we are concerned with three such narratives, although a number of others have been examined.

Firstly, the movements of Feargus O’Connor, arguably the principal leader of the Chartist movement of the 1830s and 1840s, were extensively reported in the Northern Star, which he edited; it also sold signed engravings of him. Scattered through the Star are a number of shorter narratives. For example, the issue of January 19th 1839 contained a 4,500 word letter describing an eight day tour of Scotland.

Secondly, ‘The Life and Rambles of Henry Vincent, written by himself’, which appeared as a series of articles in The Western Vindicator, a newspaper owned and edited by Vincent. The articles total c.
30,000 words, appeared in issues 3 (9th March 1839) to 13 (18th May 1839) and cover the period between February 26th and May 10th 1839. Vincent, the ‘Demosthenes of the West’, was arguably the leading figure in the Chartist movement in the west of England and South Wales. The narrative ends with Vincent in Monmouth gaol, and his release was the supposed aim of the Newport uprising in November 1839.

Finally, a series of articles about and fairly clearly by Edwin Russell, an organiser employed by the National Agricultural Labourers’ Union in the 1870s. Between September 1872 and February 1873 a series of articles in the Labourers’ Union Chronicle, totalling about 10,000 words, described Russell’s travels, mainly in Herefordshire and Gloucestershire, in the autumn of 1872.

All narratives will reflect the biases of the author, and both Cobbett and Young had well-defined agendas, but the agitators’ narratives are distinguished by the purpose not of their writing but of the journeys themselves. On occasion they adopted the conventions of orthodox travel writing; here, for example, is Vincent’s description of his journey to Ledbury:

Took coach for Ledbury, in Herefordshire. The scenery along the road is very beautiful. On the right, within four miles of Ledbury, stands Eastnor Castle, the residence of Lord Somers. The castle is a fine building, situate on a piece of rising ground, surrounded by a sheet of water, and bounded on either side by extensive plantations. The entrance into Ledbury is exceedingly pleasing. To the right, amidst a profusion of trees and shrubs, is Ledbury church.

However, their basic goal was not to describe places but to change them, by creating new local branches of their organisations and establishing a sense of solidarity spanning places. This was very much a job of work. For March 27th 1839 Vincent reported:

A meeting was called in Newport for seven.
Just before the meeting commenced the dark clouds rolled away — the rain ceased — and the silver moon looked smilingly upon us. We had above 4000 persons present. Edward Thomas took the chair. I delivered a thrilling oration to the people, which produced a pleasing effect. I felt in excellent spirits and tone notwithstanding my continued travelling and speaking; for I find, on calculating, I have spoke about two hours a day for thirteen months, and travelled six thousand and seventy-one miles. The Newport boys are advancing bravely.

Here is Russell:

I enrolled the 65 members’ names on the Madley branch book, filled up a lot of members’ cards, and put them in thorough good working order, and I think now they will be able to go on, but they do want such a lot of leading and guiding. I then walked on to Hereford, six or seven miles, in the pouring rain, it pouring all the way, and, as a consequence, I got wet though. I had a meeting planned for tonight at Wellington, but on account of the heavy rains which had fallen the rivers Lugg and Wye have overflowed their bank to such an extent that the roads are impassable, and I could not go to Wellington, but as I have got to arrange for a fortnight’s meeting on the other side of the county, I shall not be able this afternoon.

4 Marking-Up Travel Narratives

Given these sources, how can we organise them to make them more accessible, and in particular to make it possible, firstly, for users to access information on specific places and, secondly, to generate maps of where travellers went? By now, this almost inevitably means some kind of mark-up system, i.e. adding tags to the text. Both common sense and the requirements of the program funding our work mean we should as far as possible follow existing standards rather than invent our own. The obvious starting point is the work of the Text Encoding Initiative (TEI).
The TEI’s Guidelines for Electronic Text Encoding and Interchange were first published in April 1994 and were initially based on Standard Generalized Markup Language (SGML). The current version, TEI P4, is also compatible with XML (Extensible Markup Language), a subset of SGML which is now far more widely used than its parent (Sperberg-McQueen and Burnard, 2002). The full Guidelines can be downloaded from: http://www.tei-c.org

The Guidelines are just that, and any specific project will need a more specific mark-up system. Before discussing the facilities within the Guidelines for identifying locations and geographic terms, it should be noted that some key TEI-compatible mark-up schemes make no use of them:
• ‘TEI Lite’ is defined by the TEI itself as ‘a manageable subset of the full TEI encoding scheme’, but the only place-name tagging it provides for is place of publication.
• The American Memory DTD (Document Type Definition) was developed by the Library of Congress to mark up historical texts included within its very diverse American Memory system (http://lcweb2.loc.gov/ammem). This is of particular interest to us as our project is a major component within a consortium led by the British Library aiming to create a UK equivalent to American Memory. However, the AMS DTD again excludes geographical and place-name tags, and in fact the introduction to the DTD specifically states ‘it is too expensive to identify geographic names’.
• The Minnesota “Women’s Travel Writing, 1830-1930” project have developed a WTW DTD based on the TEI. They use a TEI-defined mechanism for adding interpretative information on four themes: ethnicity, gender marking, transportation and women’s occupations. However, they do not tag place-names or geographic features. They note that issues have arisen in three main areas, gender, language and geography, and that ‘our work [on geography] has not progressed as far as in the first two categories’ (Remnek et al).
This cannot pretend to be a systematic survey of TEI-based DTDs, but it is still surprising that so little use is made of the available facilities for geographical description. One project that is using these features is the Perseus Project, primarily concerned with classical literature. Their work is described in Crane et al. (2001), but the remainder of this section discusses the facilities provided by the overall TEI Guidelines for geographical mark-up, and their suitability for travellers’ tales. Firstly, the Guidelines allow simple tagging of place-names such as: I went from New