Metadata & Interoperable Digital Language Databases
by Helen Aristar-Dry and Gary Simons
The Electronic Metastructure for Endangered Languages Data (EMELD) has been a valued resource for linguists and archivists since its development between 2001 and 2007. Funded by several generous grants from the National Science Foundation, EMELD was conceived with two goals in mind: 1) to aid in the preservation of endangered language data and documentation and 2) to aid in the development of the standards necessary for effective collaboration among electronic archives.
The project ran seven workshops that brought together data scientists and documentary linguists to develop standards and recommend practices in metadata creation, lexicon construction, text annotation, database design, and tool development; the workshops resulted in numerous seminal presentations and working group reports. Associated projects also provided digital maps for locating languages (LL-MAP) and a digital library of language relationships (Multi-Tree).
The online interface for EMELD includes a School of Best Practices with comprehensive instructions for creating and preserving digital data, demonstration projects from eleven disparate endangered languages, a database of tools and readings, and a suggested ontology for language description.
While the data accumulated by these projects remains available, the project interfaces and consequent project integration have not been updated since moving to the current host institution. The lecture will review this constellation of resources, set out the need for technical and content updates, and outline the challenges in website design, digital language curation, annotation, and preservation that EMELD faces.
The lecture will be followed by a panel discussion led by Helen Aristar-Dry, Gary Simons, Alexis Palmer (Assistant Professor, Linguistics, UNT), Sadaf Munshi (Associate Professor, Linguistics, UNT), Jeonghyun "Annie" Kim (Associate Professor, Information Science, UNT), Yunfei Du (Professor, Information Science, UNT), and Oksana Zavalina (Associate Professor, Information Science, UNT).
This lecture and panel discussion will be hosted in room 250H at Willis Library (225 S. Avenue B, Denton, TX 76201). It will take place on Thursday, November 16, 2017, from 5:00 to 7:00 p.m.
Dr. Helen Aristar-Dry is a retired Professor of Linguistics at Eastern Michigan University and is now an Affiliated Researcher at UT-Austin. She was Principal Investigator on 12 National Science Foundation-sponsored projects, including E-MELD. She, along with Anthony Aristar, received the Victoria A. Fromkin Lifetime Service Award from the Linguistic Society of America in 2003.
Her current research interest is language technology; her previous interests included Linguistic Stylistics, Pragmatics, and Discourse Analysis.
She is a member of Elsevier's Scirus Scientific Advisory Board and of the Advisory Board of the Linguistics Research Center at UT-Austin. She and Anthony Aristar are the co-founders of the LINGUIST List, and she was a Moderator on LINGUIST List for 23 years. She is a member of the Linguistic Society of America.
Dr. Gary F. Simons is the Chief Research Officer at SIL International and Executive Editor of the Ethnologue. He is a co-founder of the Open Language Archives Community and co-developer of the ISO 639-3 standard of three-letter identifiers for all known languages of the world. A prolific author, he co-wrote his most recent book, 'Sustaining language use: Perspectives on community-based language development', with Melvyn Paul Lewis.
His current research interests include digital linguistic archiving, markup languages and text encoding, computational linguistics, programming languages, and historical and comparative linguistics.
He is a member of the Association for Computational Linguistics, the Association for Computing Machinery, the Linguistic Society of America, and the Text Encoding Initiative Consortium.