Annotated Corpus of Hittite clauses

(Letters, Instructions, with addition of Myths and Prayers)

Maria Molina
The aim of the project is to develop an online, syntactically annotated corpus of Hittite, an extinct Indo-European language attested from the 18th to the 12th century BC.

New electronic and online corpora for various languages appear almost daily. Yet Hittite remains practically the only major Indo-European language that has a significant body of texts but no openly available syntactically annotated corpus: there is neither a Universal Dependencies (UD) treebank nor a bank of constituent structures.

As the oldest attested Indo-European language, Hittite is of growing interest to researchers. Many papers on Hittite linguistics, including syntax and morphology, have been published over the last decade, so the need for an online Hittite corpus is increasingly pressing.

The syntactically annotated Hittite corpus project started in May 2014 at the Institute of Linguistics (Russian Academy of Sciences, Moscow), and around 6000 clauses were annotated before 2019. Since January 2016, an MS SQL relational database has been under development for the corpus, with an online search interface and limited public access to the database materials.

From 2020 to 2022 the project was suspended due to a lack of funding and the COVID-19 pandemic. Due to the political situation in Russia, it has moved to Israel together with Maria Molina, the only person responsible for this corpus. The corpus currently exists mostly as a collection of XLS files, but work on updating the database resumed in September 2022. We hope to complete it before the spring of 2023.

The corpus is based on two publications: the Hittite letters in [Hoffner 2009] and the Hittite instructions in [Miller 2013]. Hittite prayers and myths are also partly included, based on the online text corpus of the Philipps Universitaet Marburg project Gebete der Hethiter.

All texts have been analyzed and digitized according to the project guidelines.

