The Voynich Manuscript is Written in Natural Language: The Pahlavi Hypothesis

Author

  • J. Michael Herrmann
Abstract

The late medieval Voynich Manuscript (VM) has resisted decryption and was considered a meaningless hoax or an unsolvable cipher. Here, we provide evidence that the VM is written in natural language by establishing a relation between the Voynich alphabet and the Iranian Pahlavi script. Many of the Voynich characters are upside-down versions of their Pahlavi counterparts, which may be an effect of different writing directions. Other Voynich letters can be explained as ligatures or as departures from Pahlavi intended to cope with known problems due to the stupendous ambiguity of Pahlavi text. While a translation of the VM text is not attempted here, we can confirm the Voynich-Pahlavi relation at the character level by the transcription of many words from the VM illustrations and from parts of the main text. Many of the transcribed words can be identified as terms from Zoroastrian cosmology, which is in line with the use of Pahlavi script in Zoroastrian communities from medieval times.

1 Why is Voynichese difficult?

All writing systems in the world [8, 5] require some effort in acquisition and use. While for some groups of languages, difficulty and differences may be comparatively small [17], in others the complexity of the script can appear forbidding for all but a minority of scribes. Religious observance, for example, may require the adherents to continue using a script or language that no longer adapts to its language environment and that may thus tend to become ambiguous or incomprehensible. In order to retain a unique pronunciation and, supported by extensive commentaries, continuing understandability, glyphs (diacritics) were added to letters to distinguish them, or additional letters (matres lectionis) were inserted to represent sounds (such as vowels in the consonant-based (abjad) scripts).
However, such additional efforts may not be considered necessary if the oral tradition in the community is sufficiently strong, such that the texts do not have to be extracted from the writing itself but are rather remembered while being read. If the Voynich Manuscript (VM, MS 408 in the Beinecke Rare Book & Manuscript Library at Yale University) derives from such a tradition, the difficulty in reading it may be understandable.

arXiv:1709.01634v2 [cs.CL] 6 Oct 2017

The Voynich Manuscript, which is written on more than 200 vellum pages, has been dated between 1404 and 1438 (University of Arizona, 2011), but its history is largely unknown until its discovery by the bookseller Voynich in 1912. Apart from a few cautious attempts, such as Ref. [3], little progress has so far been achieved in deciphering the VM, nor has a decision even been reached on whether the VM has any meaningful content at all [19]. Our hypothesis that the VM is written in natural language is to be supported by showing that the script used in the VM is directly related to Pahlavi, a writing system that was in use for several Iranian languages from before the current era at least until 900 CE [9].

Pahlavi is a particular case of a language that is notoriously difficult to read. It was used in medieval scriptures, commentaries, and a few other texts [2] related to Zoroastrianism, the pre-Islamic religion of Persia. Over the few centuries of the language's evolution, many Pahlavi letters have coalesced; e.g., for the phonemes d, g, j, and y, only a single letter is retained in Pahlavi. Moreover, letters are usually joined in Pahlavi script and can thus appear similar to other letters: e.g., in addition to its proper meaning, a letter can be indistinguishable from as many as sixteen different phonemes or letter combinations [11].
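To make the effect of coalesced letters concrete, the following minimal Python sketch enumerates the candidate readings of a short sign sequence. It is an illustration only: the sign names `S1`, `S2`, `S3`, the table `READINGS`, and both helper functions are hypothetical, and the only entry taken from the text is the single sign shared by d, g, j, and y.

```python
from itertools import product

# Hypothetical, simplified table mapping each written sign to the phonemes
# it may stand for. Only the d/g/j/y merger (S1) comes from the text; S2 and
# S3 are placeholders and do not describe actual Pahlavi letters.
READINGS = {
    "S1": ["d", "g", "j", "y"],  # one sign for four phonemes (from the text)
    "S2": ["m"],                 # assumed unambiguous, for illustration only
    "S3": ["a", "b"],            # assumed two-way merger, for illustration only
}


def candidate_readings(signs):
    """Return every phoneme string the sign sequence could stand for."""
    options = [READINGS[s] for s in signs]
    return ["".join(combo) for combo in product(*options)]


def count_readings(signs):
    """Number of candidate readings: the product of per-sign ambiguities."""
    total = 1
    for s in signs:
        total *= len(READINGS[s])
    return total


if __name__ == "__main__":
    word = ["S1", "S2", "S1"]
    print(count_readings(word))
    print(candidate_readings(word)[:4])
```

The count grows multiplicatively with word length, which suggests why a reader without prior knowledge of the text would face a large search space even before ligatures and heterograms are taken into account.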
In some words, corrupted forms of letters have become a standard that is accepted to various degrees by the scribes. In addition to Persian words, Pahlavi also contains a large number of heterograms, i.e. around a thousand, partially very common words of Aramaic origin that are meant to be read in Persian (just as the Latin abbreviation "i.e." is read in English as "that is"). Finally, as for many other ancient texts, material decay, language drift, scribe errors, unfamiliarity with the original cultural context, and, possibly, the need of the writers to hide the content from contemporary hostility also contribute to the difficulty of reading the text.

Concerning recent work on the VM, statistical approaches [1, 10, 14] that search for nonrandom features in data may be bound to fail if the target is quite random to begin with. The standard Voynich character set (EVA) [7] is not too helpful either, because it is unrelated to the phonemics, breaks some of the letters into smaller parts, and fails to identify ligatures, all of which may further reduce the strength of the statistical analysis, cf. [20, 21, 19, 10]. In addition, the extensive 19th century literature dedicated to religious writing, see e.g. [15], was difficult to access until scanned copies recently became available online, and, finally, it may be construed that our academic habits thwart the systematic study of matters as obscure as the VM.

The Pahlavi hypothesis was already proposed informally in 2005 [18] (footnote 1). There, the hypothesis is based on the similarity of the numbers of letters (“14 17”) in the Voynich and Pahlavi alphabets and on a general perception of a topical relation to the Bundahesh and the Denkard. A small sample of word lengths from a Pahlavi text was also included, but it was not compared to a transliteration of the Voynich text. The present paper aims at providing evidence for the hypothesis that the VM is a readable text that is of interest in itself.
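The kind of word-length comparison that is described above as missing from the 2005 post could be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this paper or of Ref. [18]: the function names are hypothetical, the token lists are placeholders rather than real Pahlavi or Voynich transliterations, and total variation distance is just one of several possible ways to compare two distributions.

```python
from collections import Counter


def length_distribution(words):
    """Relative frequency of each word length in a list of tokens."""
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    0 means the distributions are identical, 1 means they do not overlap.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Placeholder tokens only; they stand in for two transliterated word samples.
sample_a = ["ab", "abc", "abcd", "abc"]
sample_b = ["ab", "abc", "abcd", "abcde"]

if __name__ == "__main__":
    dist_a = length_distribution(sample_a)
    dist_b = length_distribution(sample_b)
    print(total_variation(dist_a, dist_b))
```

Any such comparison depends on what counts as one letter: a transliteration that splits letters into smaller parts or misses ligatures would shift the measured word lengths, which is the concern raised above about EVA.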
Our approach consists in establishing a relation between the Voynich and Pahlavi scripts (see Section 2). It will also become clear that only within a cooperation among experts in Pahlavi philology, Zoroastrianism, history of medicine, botany,

Footnote 1: I was not aware of this newsgroup post until I found a Twitter comment on the first version of the current paper in which Ref. [18] was mentioned.

Related resources

Co-Occurrence Patterns in the Voynich Manuscript

The Voynich Manuscript is a medieval book written in an unknown script. This paper studies the distribution of similarly spelled words in the Voynich Manuscript. It shows that the distribution of words within the manuscript is not compatible with natural languages.

How the Voynich Manuscript was created

The Voynich manuscript is a medieval book written in an unknown script. This paper studies the relation between similarly spelled words in the Voynich manuscript. By means of a detailed analysis of similarly spelled words, it was possible to reveal the text generation method used for the Voynich manuscript.

Unsupervised Analysis of the Voynich Manuscript

The aim of this project is to research the possibilities of applying unsupervised learning techniques for natural language and other sequential data to undeciphered texts and manuscripts. The undeciphered text used is the Voynich Manuscript, a mysterious book from the 15th or 16th century that is written in an unknown script. Some methods that could be applied to manuscripts such as these will ...

Statistical Analysis of Unknown Written Language: The Voynich Manuscript

The Voynich Manuscript is a document written in an unknown language or cipher. This research proposal presents an idea for determining possible relationships within the Voynich Manuscript, to be pursued through known statistical methods from linguistics. The document reviews previous research carried out by other researchers. The proposed method is given and shows the current results obt...

Analysis of Letter Frequency Distribution in the Voynich Manuscript

The Voynich manuscript is one of the biggest mysteries in linguistic science. Although much research has been carried out, the author, the origin, and the content of the manuscript still remain unknown. In this work, letter frequency distributions of about 300 languages were compared to that of the language of the Voynich manuscript. The study shows the most similar languages according to this cha...

Journal:
  • CoRR

Volume: abs/1709.01634

Publication date: 2017