| Linguistics 483 Data Files |
A. C. Brett acbrett@uvic.ca
Department of Linguistics University of Victoria Clearihue C139 |
Variable  Position  Description and Coding
________  ________  ___________________________________
WORD       1 - 19   Dictionary head word, coded as
                    alphanumeric
LANGUAGE  20        Etymological code identifying the
                    language from which the word entered
                    English, coded alphanumeric as:
                       'L'  Latin
                       'G'  Greek
                       'F'  French
                       'A'  Anglo-Saxon
                       'H'  Hindi
                       'P'  Persian
                       'S'  Scandinavian
                       'I'  Italian
                       'C'  Spanish
                       'E'  English
                       'D'  Dutch
LETTERS   21 - 24   Number of letters in the word, coded
                    as integer
SOUNDS    25 - 28   Number of phonetic symbols in the
                    word, integer
SYLLBLES  29 - 32   Number of syllables in the word,
                    integer
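A layout like the one above can be read with plain string slicing. The following is a minimal sketch, assuming each record is a fixed-width text line whose 1-based, inclusive column positions match the table. The function name `parse_word_record` and the sample record are hypothetical and are not taken from the data file.

```python
# Column layout copied from the table above: (name, first, last),
# with 1-based, inclusive positions.
WORD_LAYOUT = [
    ("WORD", 1, 19),
    ("LANGUAGE", 20, 20),
    ("LETTERS", 21, 24),
    ("SOUNDS", 25, 28),
    ("SYLLBLES", 29, 32),
]

# Etymological codes listed in the table.
LANGUAGE_CODES = {
    "L": "Latin", "G": "Greek", "F": "French", "A": "Anglo-Saxon",
    "H": "Hindi", "P": "Persian", "S": "Scandinavian", "I": "Italian",
    "C": "Spanish", "E": "English", "D": "Dutch",
}

INTEGER_FIELDS = {"LETTERS", "SOUNDS", "SYLLBLES"}


def parse_word_record(line):
    """Slice one fixed-width record into a dict keyed by variable name."""
    record = {}
    for name, first, last in WORD_LAYOUT:
        text = line[first - 1:last].strip()
        record[name] = int(text) if name in INTEGER_FIELDS else text
    return record


# Made-up record built to match the layout; not a line from the data file.
sample = "window".ljust(19) + "S" + "6".rjust(4) + "5".rjust(4) + "2".rjust(4)
parsed = parse_word_record(sample)
language_name = LANGUAGE_CODES[parsed["LANGUAGE"]]
```

Keeping the layout in one list means a change in column positions only has to be made in one place.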
Variable  Position  Description and Coding
________  ________  ___________________________________
TYPE       3 - 4    Source document type, coded as:
                    '01', if the sentence be from
                          an informal narrative document
                          (newspaper or magazine), or
                    '02', if the sentence be from a formal
                          narrative source (technical
                          article or monograph).
WORDS      5 - 8    The total number of words in the
                    sentence, coded as integer.
LETTERS    9 - 12   The total number of letters in the
                    words comprising the sentence,
                    coded as integer.
STRUCTUR  15 - 16   Sentence structure, coded as:
                    '00', if the sentence be of simple
                          structure (one independent
                          clause), or
                    '01', if the sentence be compound (two
                          or more independent clauses, but
                          no dependent clause(s)), or
                    '10', if the sentence be complex (one
                          independent clause and one or
                          more dependent clause(s)), or
                    '11', if the sentence be compound/
                          complex (two or more independent
                          clauses and at least one
                          dependent clause).
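The two-character codes above lend themselves to lookup tables. The sketch below assumes fixed-width records laid out as in the table. It also reads the STRUCTUR code as two flags: the first character marks a dependent clause and the second marks more than one independent clause. That reading is an inference from the four listed codes, not something the codebook states. The function names and the sample record are hypothetical.

```python
TYPE_CODES = {
    "01": "informal narrative (newspaper or magazine)",
    "02": "formal narrative (technical article or monograph)",
}

STRUCTURE_CODES = {
    "00": "simple",
    "01": "compound",
    "10": "complex",
    "11": "compound/complex",
}


def parse_sentence_record(line):
    """Slice one record; positions 1-2 and 13-14 are not documented above."""
    return {
        "TYPE": line[2:4],           # positions 3-4
        "WORDS": int(line[4:8]),     # positions 5-8
        "LETTERS": int(line[8:12]),  # positions 9-12
        "STRUCTUR": line[14:16],     # positions 15-16
    }


def describe_structure(code):
    """Return (label, has_dependent_clause, has_several_independent_clauses).

    Assumption: the first character flags a dependent clause and the
    second flags two or more independent clauses.
    """
    return STRUCTURE_CODES[code], code[0] == "1", code[1] == "1"


# Made-up record: undocumented positions are left blank.
sample = "  " + "01" + "  12" + "  57" + "  " + "10"
record = parse_sentence_record(sample)
structure = describe_structure(record["STRUCTUR"])
```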
Variable  Position  Description and Coding
________  ________  ___________________________________
SEQ.NUMB   1 - 3    Utterance Sequence Number, integer
LENGTH     4 - 6    Utterance Length in Morphemes,
                    integer
Observations on the first variable, named FAMILIAR, represent head turn durations elicited by pseudowords that were presented during the two-minute habituation phase of the experiment. Values on the NOVEL variable correspond to head turn durations resulting from pseudowords made up of the same syllables, but presented in different orders from the pseudowords in the habituation corpus.
Note that the data used here are not the Saffran et al. data. The data used in this example consist of a pseudorandom sample generated with approximately the same parameter estimates (statistics) as those reported by Saffran et al.
          Position
Variable  in Data   Attribute Measured
Name      Records   or Identified
________  ________  ________________________
FAMILIAR   1 - 8    Familiar items head turn
                    times
NOVEL      9 - 16   Novel items head turn
                    times
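Because each record pairs a FAMILIAR value with a NOVEL value, a natural first summary is the mean of the paired differences. The sketch below assumes that both fields hold decimal numbers within their eight-column fields. The function names and the two sample records are hypothetical and are not values from the data file.

```python
from statistics import mean


def parse_head_turn_record(line):
    """Return (familiar, novel) from positions 1-8 and 9-16."""
    return float(line[0:8]), float(line[8:16])


def mean_paired_difference(lines):
    """Mean of NOVEL minus FAMILIAR over all records."""
    differences = [
        novel - familiar
        for familiar, novel in map(parse_head_turn_record, lines)
    ]
    return mean(differences)


# Made-up records, each exactly 16 columns wide.
sample_lines = [
    "    7.50    8.50",
    "    8.00    9.50",
]
difference = mean_paired_difference(sample_lines)
```

A positive result indicates longer head turns for novel items in the sample.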
Fundamental frequencies (in hertz) and start and end times (in hundredths of milliseconds) were measured as subjects uttered the target two-digit numbers. For each target number, the initial, midpoint, and final fundamental frequencies of the root number and of the affix were recorded, yielding six frequency observations (three for the root morpheme and three for the affix), followed by four times (the start and end times of the root and of the affix). Positions of these observations in the data records, together with the speaker identifier, the context (seven-digit number), the position of the target number in the context, and the target number itself, are as follows:
          Position
Variable  in Data   Attribute Measured
Name      Records   or Identified
________  ________  ______________________
SPEAKER    1(A)     Speaker ID letter
CONTEXT    3-9(A)   Complete number spoken
POSITION  11        Target number position
NUMBER    13-14     Target number
RF1       16-18     Root F0 at start
RF2       20-22     Root F0 at midpoint
RF3       24-26     Root F0 at end
AF1       28-30     Affix F0 at start
AF2       32-34     Affix F0 at midpoint
AF3       36-38     Affix F0 at end
RT1       40-44     Root V start time
RT2       46-50     Root V end time
AT1       52-56     Affix V start time
AT2       58-62     Affix V end time
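Since the times are stored in hundredths of milliseconds, durations in milliseconds follow by subtracting the start from the end and dividing by 100. The sketch below assumes fixed-width records matching the table and treats the fields marked (A) as text. That reading of (A) is an assumption, and the function names and sample record are hypothetical.

```python
# (name, first, last, converter); positions are 1-based and inclusive.
F0_LAYOUT = [
    ("SPEAKER", 1, 1, str),
    ("CONTEXT", 3, 9, str),
    ("POSITION", 11, 11, int),
    ("NUMBER", 13, 14, int),
    ("RF1", 16, 18, int), ("RF2", 20, 22, int), ("RF3", 24, 26, int),
    ("AF1", 28, 30, int), ("AF2", 32, 34, int), ("AF3", 36, 38, int),
    ("RT1", 40, 44, int), ("RT2", 46, 50, int),
    ("AT1", 52, 56, int), ("AT2", 58, 62, int),
]


def parse_f0_record(line):
    """Slice one record into a dict keyed by variable name."""
    return {
        name: convert(line[first - 1:last].strip())
        for name, first, last, convert in F0_LAYOUT
    }


def durations_ms(record):
    """Return (root, affix) durations in milliseconds."""
    root = (record["RT2"] - record["RT1"]) / 100
    affix = (record["AT2"] - record["AT1"]) / 100
    return root, affix


# Made-up record, 62 columns wide; none of these values come from the file.
sample = "A 8314267 2 14 120 118 115 110 108 105 10000 25000 26000 40000"
record = parse_f0_record(sample)
root_ms, affix_ms = durations_ms(record)
```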
Participants in the pilot project consisted of 64 teachers taking introductory linguistics courses during Summer Session.
The survey instrument included 405 items. For purposes of the analyses being undertaken, responses to related items were combined to produce measurements on 31 scales. Since different numbers of items were included in each scale, the scales were transformed so that the values on each of them ranged from 0 to 99.
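The text does not say which transformation put the scales on a common 0 to 99 range. One simple possibility is a linear rescaling from each scale's lowest and highest attainable raw scores, sketched below. The function name `rescale_to_0_99` and the example scale (ten items, each scored 1 to 5) are hypothetical.

```python
def rescale_to_0_99(raw_score, raw_min, raw_max):
    """Map raw_score linearly from [raw_min, raw_max] onto 0..99.

    Assumption: a linear map with rounding; the original transformation
    is not documented.
    """
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    if not raw_min <= raw_score <= raw_max:
        raise ValueError("raw_score lies outside the scale range")
    return round(99 * (raw_score - raw_min) / (raw_max - raw_min))


# Hypothetical scale: ten items scored 1-5, so raw totals run from 10 to 50.
lowest = rescale_to_0_99(10, 10, 50)
highest = rescale_to_0_99(50, 10, 50)
quarter = rescale_to_0_99(20, 10, 50)
```

With this map, scales built from different numbers of items share the same endpoints, which is the property the transformation is described as providing.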
The data file stored here consists of the measurements on these 31 transformed scales together with responses on the biographical variables.
Variable  Position  Description
________  ________  ________________________________________
AGE        1-3      Informant Age in Months
SEX        4        Informant Sex
GENER      5        Generation in Canada
STATUS     6        Father's Occupation Socioeconomic Status
S1A        7-8      Anomie
S1B        9-10     Ethnocentrism
S1C       11-12     Preference for Canada over USA
S1D       13-14     Preference for US over UK
S1E       15-16     Preference for Canada over UK
S1F       17-18     Attitude toward USA Relative to the UK
S1G       19-20     Authoritarianism
S1H       21-22     Sensitivity for Others
S1I       23-24     Canadian Nationalism & National Identity
S2A       25-26     Attitude toward Americans
S2B       27-28     Attitude toward Scots
S2C       29-30     Attitude toward English
S2D       31-32     Attitude toward Canadians
S2E       33-34     Subject Self-Rating: Others See Her/Him
S2F       35-36     Subject Self-Rating: Self Perception
S3A       37-38     Morphosyntactic: What the Subject Hears
S3B       39-40     Morphosyntactic: Use with Superiors
S3C       41-42     Morphosyntactic: Subject Uses with Peers
S4A       43-44     Phonological: Vowel before "R"
S4B       45-46     Phonological: Syllabicity
S4C       47-48     Phonological: Low Back Vowel
S4D       49-50     Phonological: Diphthong Raising
S4E       51-52     Phonological: "Y"-Glide
S4F       53-54     Phonological: "WH"-Feature
S4H       55-56     Phonological: Vowel Deletion
S4I       57-58     Phonological: "T"-Flapping
S4J       59-60     Morphophonemic: US/UK Dichotomy
S4K       61-62     Morphophonemic: Standard/Nonstandard Dialect
S4L       63-64     Lexical: UK/US Dichotomy
S4M       65-66     Lexical: Standard/Nonstandard Dialect
S4N       67-68     Lexical: Canadian Local Dialect
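The 31 scales occupy consecutive two-column fields starting at position 7, so their positions can be computed rather than typed out. The sketch below assumes fixed-width records matching the table. It keeps SEX, GENER, and STATUS as raw characters because their codings are not documented here. The function name and the sample record are hypothetical.

```python
# Scale names in table order; the table lists no S4G.
SCALE_NAMES = (
    [f"S1{c}" for c in "ABCDEFGHI"]
    + [f"S2{c}" for c in "ABCDEF"]
    + [f"S3{c}" for c in "ABC"]
    + [f"S4{c}" for c in "ABCDEFHIJKLMN"]
)


def parse_survey_record(line):
    """Slice one record into biographical fields and 31 scale values."""
    record = {
        "AGE": int(line[0:3]),  # positions 1-3
        "SEX": line[3],         # position 4, coding not documented
        "GENER": line[4],       # position 5, coding not documented
        "STATUS": line[5],      # position 6, coding not documented
    }
    for index, name in enumerate(SCALE_NAMES):
        start = 6 + 2 * index   # 0-based offset of position 7 + 2 * index
        record[name] = int(line[start:start + 2])
    return record


# Made-up record: scale values 0, 1, ..., 30, each right-justified in two columns.
sample = "312" + "1" + "2" + "3" + "".join(f"{value:2d}" for value in range(31))
record = parse_survey_record(sample)
```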
Note that the data used here are not the Alario et al. data.
Variable  Position  Description
________  ________  ______________________________
NOUN       1 - 8    Noun (alphanumeric)
NFREQ     10 - 12   Noun Frequency
ADJ       15 - 22   Adjective (alphanumeric)
AFREQ     24 - 26   Adjective Frequency
I_NFREQ   29 - 30   Noun Frequency Indicator
I_AFREQ   33 - 34   Adjective Frequency Indicator
TIME      37 - 41   NP Production Latency
The judges consisted of ten English teachers who were native speakers of English, ten native speakers of English who were not teachers, and ten English teachers whose native language was Greek. Each of the 32 sentences contained one error, and the judges evaluated the severity of the error on a 0 - 5 scale. For each sentence, the score for each group of judges is the sum of the severity ratings assigned by the judges in that group.
The data comprise the sentence number and the total severity scores for the sentence determined by each of the three groups of judges.
Variable  Position  Description
________  ________  ______________________________
SENTENCE   1 - 2    Sentence number
E_TEACHR   3 - 6    English-speaking teacher
G_TEACHR   7 - 10   Greek-speaking English teacher
NONTEACH  11 - 14   English-speaking non-teacher
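With ten judges per group and ratings from 0 to 5, each group total lies between 0 and 50. The sketch below shows how such a total arises and how a record in the layout above might be read. The function names, the ratings, and the sample record are hypothetical and are not values from the data file.

```python
def group_total(ratings):
    """Sum one group's severity ratings for a single sentence."""
    if any(not 0 <= rating <= 5 for rating in ratings):
        raise ValueError("each rating must lie on the 0-5 scale")
    return sum(ratings)


def parse_severity_record(line):
    """Slice one record into the sentence number and three group totals."""
    return {
        "SENTENCE": int(line[0:2]),    # positions 1-2
        "E_TEACHR": int(line[2:6]),    # positions 3-6
        "G_TEACHR": int(line[6:10]),   # positions 7-10
        "NONTEACH": int(line[10:14]),  # positions 11-14
    }


# Made-up ratings from ten judges for one sentence.
ratings = [5, 4, 3, 5, 2, 4, 3, 4, 4, 3]
total = group_total(ratings)

# Made-up record, 14 columns wide.
sample = " 1  37  22  41"
record = parse_severity_record(sample)
```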
The Vocabulary Level corresponds to the size of the children's vocabulary, and the Grammatical Complexity Scores for the English-speaking and Italian-speaking children measure the grammatical complexity of their utterances as assessed by their caregivers.
Variable  Position  Description
________  ________  _______________________________
VOCAB      1 - 4    Vocabulary Level
SYN_ENGL   5 - 10   Grammatical Complexity (English)
SYN_ITAL  11 - 16   Grammatical Complexity (Italian)