Last modified: 02.02.2007 02:50
Lecture №5: Lexicon Data and Their Structure, by Thorsten Trippel
Sixth Session: 21.11.2006
GO TO THE NEXT LECTURE: Lecture 6
This lecture was given by Mr Thorsten Trippel, who substituted for Prof. Gibbon on that day only.
Summary of the session:
-
This session was largely a revision of what we have covered with Prof. Gibbon so far, but from a different point of view. In addition, Mr Trippel introduced some new concepts and terms.
-
Overview of the session
-
LEXICON DATABASE STRUCTURES
-
Microstructure: 1. number of lexicon articles/entries/records 2. order of the data categories (DATCAT)
-
Mesostructure: 1. Interrelation of lexicon entries 2. relation to external information
-
Macrostructure: 1. Order of lexicon entries 2. selection of sort key 3. sorting order not trivial
-
-
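The macrostructure point that the sorting order is "not trivial" can be illustrated with a small sketch. The `sort_key` function and the headwords below are made up for illustration and were not shown in the lecture; the sketch only assumes that a lexicon wants case and diacritics ignored when ordering entries.

```python
import unicodedata


def sort_key(headword: str) -> str:
    """Hypothetical sort key: ignore case and diacritics."""
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", headword)
    # Drop the combining marks, keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


headwords = ["zebra", "Äpfel", "apple", "Zoo", "élan"]

# Plain code-point order: capitals and accented letters end up in odd places.
naive = sorted(headwords)

# Order determined by the chosen sort key.
by_key = sorted(headwords, key=sort_key)

print(naive)
print(by_key)
```

A different choice of sort key (for example, treating "ä" as "ae") would give a different macrostructure for the same set of entries.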
LEXICON MICROSTRUCTURE
-
words (in most lexicons): a pictorial dictionary, for instance, has pictures instead of words
-
grammatical information: POS, inflectional class, valence, etc.
-
representation of meaning: semantics, definition
-
corpus reference: usage examples
-
-
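As a rough sketch of the microstructure listed above, one lexicon entry can be modelled as a record whose fields are the data categories. The class name, field names and the sample entry are assumptions made for this illustration, not something given in the lecture.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class LexiconEntry:
    """One lexicon article; each field is one data category."""
    headword: str                 # the word itself
    pos: str                      # grammatical information: part of speech
    inflection_class: str         # grammatical information: inflectional class
    definition: str               # representation of meaning
    examples: List[str] = field(default_factory=list)  # corpus references


# A made-up sample entry.
entry = LexiconEntry(
    headword="walk",
    pos="verb",
    inflection_class="regular",
    definition="to move on foot at a moderate pace",
    examples=["They walk to school every day."],
)

# The order of the data categories is part of the microstructure.
datcat_order = [f.name for f in fields(entry)]
print(datcat_order)
```

Reordering or adding fields changes the microstructure without touching the set of entries.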
DETOUR: corpus
-
collection of language material: texts, speech
-
with additional information: POS, lemma, transcription/annotation
-
with a specific structure: interlinear glossing, special make-up
-
-
TYPES OF LEXICON
-
semasiological lexicons – a form is mapped to its meaning
-
onomasiological lexicons – a meaning is mapped to its form (e.g. thesauri or termbases)
-
-
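The two directions of mapping can be sketched with plain dictionaries. The sample words and the `invert` helper are invented for this illustration; a real thesaurus or termbase is of course organised far more richly.

```python
from collections import defaultdict

# Semasiological direction: form -> meanings (made-up sample data).
semasiological = {
    "couch": ["piece of furniture for sitting"],
    "sofa": ["piece of furniture for sitting"],
    "bank": ["financial institution", "edge of a river"],
}


def invert(form_to_meanings):
    """Build the onomasiological direction: meaning -> forms."""
    meaning_to_forms = defaultdict(list)
    for form, meanings in form_to_meanings.items():
        for meaning in meanings:
            meaning_to_forms[meaning].append(form)
    return dict(meaning_to_forms)


onomasiological = invert(semasiological)

print(semasiological["bank"])
print(onomasiological["piece of furniture for sitting"])
```

The same data therefore supports both lookups: starting from a word to find what it means, or starting from a meaning to find the words for it.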
Other types of lexicons:
-
word frequency lexicon – ordered by frequency, with the most frequent word first
-
lexicon of phrasal verbs – by part of speech and a special structure
-
rhyming lexicon – by word ending
-
picture lexicon – by prototype (models of what we know in the real world)
-
-
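Two of these orderings are easy to sketch on a toy word list. The sentence below is invented, and sorting by the reversed spelling is only a crude stand-in for "by word ending" (a real rhyming lexicon would more likely work on pronunciation).

```python
from collections import Counter

# Made-up toy text.
tokens = "the cat sat on the mat and the rat sat".split()

# Word frequency lexicon: (word, count) pairs, most frequent first.
frequency_lexicon = Counter(tokens).most_common()

# Rhyming lexicon: order words by their endings by sorting on the
# reversed spelling, so words with the same ending sit next to each other.
rhyming_lexicon = sorted(set(tokens), key=lambda word: word[::-1])

print(frequency_lexicon)
print(rhyming_lexicon)
```

In the second list, "cat", "mat", "rat" and "sat" end up as neighbours because they share the ending "-at".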
Problematic issues in lexicography
-
ambiguity:
-
synonyms – two word forms, same meaning
-
polysemy – one word form, two (or more) slightly different meanings
-
homonyms – one word form, completely different meanings
-
-
search word:
-
languages with inflectional prefixes
-
orthographic ambiguity
-
picture lexicons?
-
-
language change:
-
“new” words
-
new meaning
-
-
-
Solutions to the problems:
-
ambiguity: enumeration
-
search word: arbitrary definition
-
language change: new edition
-
more fundamental solution
-
-
LEXICON CREATION
-
methods of creating lexicons
-
introspection
-
questionnaire
-
corpus
-
-
Introspection-based lexicon creation:
-
“look inside” - by trained linguists
-
reflecting on one's own language use
-
social filter: relevance (which words are used: swearwords, taboo words), importance, adequacy
-
-
Questionnaire-based lexicon creation:
-
in comparative linguistics
-
typology
-
unknown languages
-
requirements and limitations
-
-
Example Questionnaire:
-
asking questions for translation, explanation
-
social filters apply
-
-
Corpus-based lexicon creation:
-
“reflect the evidence”: include words found, exclude items not in the corpus
-
based on corpora: list of all words (wordlist), words in context (concordance), distribution analysis (HMM, hidden Markov model)
-
flat tabular lexicon: only two columns
-
generalizations in the lexicon
-
declarative lexicons
-
-
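The first two corpus-based steps, the wordlist and the concordance, can be sketched in a few lines. The toy corpus, the very simple tokenizer and the function names are assumptions for this illustration; the distribution analysis (HMM) step is not covered here.

```python
import re

# Made-up toy corpus.
corpus = "The cat sat on the mat. The dog sat on the log."


def tokenize(text):
    """Very simple tokenizer: lower-case runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def wordlist(tokens):
    """All distinct words found in the corpus, alphabetically."""
    return sorted(set(tokens))


def concordance(tokens, keyword, width=2):
    """Each occurrence of keyword with `width` words of context per side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


tokens = tokenize(corpus)
print(wordlist(tokens))
for left, keyword, right in concordance(tokens, "sat"):
    print(f"{left:>10} [{keyword}] {right}")
```

This follows the "reflect the evidence" idea: a word such as "bird" is simply absent from the wordlist because it does not occur in the corpus.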
Application for corpus-based lexicon creation:
-
SIL TOOLBOX
-
interlinearization of text: one line “base” text, one line gloss, one line morphology
-
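The three-line layout of interlinearized text can be imitated with a small alignment routine. This is only a sketch of the layout idea: it does not use or reproduce the SIL Toolbox file format, and the function name, the German sample sentence and the morphology labels are chosen just for this example.

```python
def interlinearize(*tiers):
    """Align several tiers word by word in fixed-width columns."""
    if len({len(tier) for tier in tiers}) != 1:
        raise ValueError("all tiers need the same number of items")
    # Each column is as wide as its longest cell across all tiers.
    widths = [max(len(cell) for cell in column) for column in zip(*tiers)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(tier, widths)).rstrip()
        for tier in tiers
    )


base = ["Die", "Katzen", "schlafen"]    # "base" text line
gloss = ["the", "cats", "sleep"]        # gloss line
morph = ["DET.PL", "N-PL", "V-3PL"]     # morphology line (illustrative labels)

result = interlinearize(base, gloss, morph)
print(result)
```

Each word of the base text then stands directly above its gloss and its morphological label.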
Quizzes:
No quizzes today!
Homework:
No homework for today!