Processing Prefabs

Posted in Lazy Linguist on February 2, 2012 by consortiumoffools

This is a paper my linguistics group and I wrote describing our study of prefabricated utterances stored as single lexical units.

Apparently it was supposed to be bilingual, but I’m pretty sure we only had L1 English speakers.


Processing prefabs:

An experiment involving differences between L1 and L2 English speakers

Proposal # 522


In this experiment, we aim to find out whether frequent collocations are stored in the lexicon as prefabricated units (prefabs). If we find that these prefabs are stored as specialized entities within the lexicon, we will then attempt to explain the differences between native speakers and second-language speakers of English with regard to prefab retention and recognition. This study emulates others in the field, but we will collect a broader range of data so that our results generalize better to real-world language use.


Our experiment follows the path set forth by Vogel Sosa and MacFarlane (2002), among others. Vogel Sosa and MacFarlane found that their test subjects were less likely to detect the target word, ‘of,’ when it occurred within a known collocation such as ‘kind of.’ The researchers determined that L1 English speakers do not perceive the ‘of’ in a frequent collocation as readily as they perceive it in an infrequent one. Their study concludes that native English speakers store frequent prefabs holistically, but it does not address speakers of English as a second language.

The study conducted by Leśniewska and Witalisz (2007), however, does address the issue of L1 versus L2 speakers. The researchers tested both native Polish speakers and native English speakers on frequent collocations in both Polish and English. By testing each participant with a series of common and uncommon prefabs, the researchers found that there were sociolinguistic factors at play. Frequent prefabs exist in many languages, and many of them do not carry over into other languages. A prefab in Polish, with its own grammatical qualities, does not translate into English in an understandable way. In a similar fashion, some English prefabs are not acceptable in Polish. This study shows that selecting appropriate prefabs is key to creating meaningful data.

Britt Erman (2007) addresses the importance of prefab collocations. In this study, the researcher looks at pauses, their duration, and their position relative to known collocations. The study concludes that speakers pause less between individual frequent collocations during normal discourse.

An experiment conducted at the University of Colorado, reported in Oliver, Healy, and Mross (2005), looked at letter detection and comprehension. The only aspect of this study we chose to address is the small sample size. Their sample was composed of 32 CU undergraduates participating for class credit. We feel that to gain a more accurate data set, we need to expand our sample beyond college students.

By taking the other studies into account, we can create a logical, methodical experiment. We will use methods similar to theirs, yet try to weed out the unknown variables and arrive at more precise data. We will follow in the footsteps of Vogel Sosa and MacFarlane by testing for English prefabs, but we will test L2 English speakers as well as L1 speakers. In our study we will use only English prefabs, in order to avoid the issues that Leśniewska and Witalisz encountered when they studied prefabs from both languages. We will not test for pauses, yet we will keep in mind the conclusions reached by Erman (2007). We will also use a larger and more varied sample than the small one used by Oliver, Healy, and Mross (2005).

We want to determine whether there is a clear difference between L1 and L2 speakers in their reception of prefabbed collocations, and to what degree the groups differ. We hypothesize that all speakers (both L1 and L2) will provide faster and more accurate responses to the infrequent collocations than to the frequent collocations, but that the difference in accuracy and response time between the two types of collocations will be greater for the native speakers than for the non-native speakers. Our study will add to the field of linguistics by explaining correlations in prefab collocation recognition among L1 and L2 English speakers as well as connecting ideas generated by others before us.
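The hypothesis can be read as a comparison of two frequency effects, one per group. The following Python sketch shows that comparison under stated assumptions: the names are hypothetical and the reaction times are made-up placeholder values, not data from this study.

```python
# Hypothetical mean reaction times in milliseconds; placeholder values only,
# not results from this study.
mean_rt_ms = {
    ("L1", "frequent"): 620.0,
    ("L1", "infrequent"): 540.0,
    ("L2", "frequent"): 700.0,
    ("L2", "infrequent"): 670.0,
}

def frequency_effect(means, group):
    """Return mean RT for frequent minus infrequent collocations for one group."""
    return means[(group, "frequent")] - means[(group, "infrequent")]

l1_effect = frequency_effect(mean_rt_ms, "L1")
l2_effect = frequency_effect(mean_rt_ms, "L2")

# The hypothesis predicts both effects are positive (slower on frequent
# collocations) and that the L1 effect is the larger of the two.
hypothesis_pattern = l1_effect > 0 and l2_effect > 0 and l1_effect > l2_effect
print(l1_effect, l2_effect, hypothesis_pattern)
```

The same subtraction could be applied to accuracy scores, with the sign reversed, since the hypothesis expects lower accuracy on frequent collocations.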


Participants: 25 native English speakers and 25 non-native English speakers, ranging in age from 19 to 56, with varying backgrounds. 30 participants are female and the other 20 are male. The non-native English speakers are originally from Mexico, Russia, Puerto Rico, the Dominican Republic, Brazil, Germany, the Philippines, China, and a few from the US. The speakers who originate from the US grew up learning Spanish in the Southwest.

Materials: Each subject will take a computer test in the same room at roughly the same time of day. The computer program plays a sound file that includes the target word many times. Participants need only be able to hear the speech and press the space bar when they recognize the target word.

Procedure: Each subject, native and non-native, will first be tested to determine their level of English fluency. They will be rated on a 1-5 scale, 1 being the least fluent in English and 5 the most. The test involves the participant reading a paragraph out loud. We then measure the reading time and the number of intraphrasal pauses and rate the speaker accordingly.

Participants will then listen to 50 sentences and will be asked to hit the space bar when they hear the target word, ‘of’. The target word occurs in high-frequency (e.g. ‘kind of’) and low-frequency (e.g. ‘point of’) collocations. Accuracy and reaction time will be measured by a computer program, and the researchers will evaluate and analyze the data collected.
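As a rough illustration of how accuracy and reaction time might be scored from space-bar presses, here is a minimal Python sketch. It is not the program used in the study: the function names, the example timings, and in particular the 1500 ms response window are assumptions made for the example.

```python
from statistics import mean

RESPONSE_WINDOW_MS = 1500  # assumed cutoff; the proposal does not specify one

def score_trial(target_onset_ms, press_times_ms, window_ms=RESPONSE_WINDOW_MS):
    """Return (hit, reaction_time_ms) for one sentence.

    A hit is the first space-bar press at or after the target onset and
    within the response window; otherwise the trial is a miss with no RT.
    """
    for press in sorted(press_times_ms):
        if target_onset_ms <= press <= target_onset_ms + window_ms:
            return True, press - target_onset_ms
    return False, None

def summarize(trials):
    """Return (accuracy, mean RT over hits) for a list of (onset, presses)."""
    scored = [score_trial(onset, presses) for onset, presses in trials]
    if not scored:
        return None, None
    hit_rts = [rt for hit, rt in scored if hit]
    accuracy = len(hit_rts) / len(scored)
    mean_rt = mean(hit_rts) if hit_rts else None
    return accuracy, mean_rt

# Made-up example: three sentences from one participant in one condition.
# The third trial has an early press (before the target) that is ignored.
example_trials = [(2000, [2450]), (1800, []), (2500, [1000, 3100])]
accuracy, mean_rt = summarize(example_trials)
print(accuracy, mean_rt)
```

Running `summarize` separately on the high-frequency and low-frequency sentences for each participant would give the per-condition figures that the analysis compares.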

Predicted Results:

We predict that both L1 English speakers and their L2 counterparts will be faster at identifying the target word when it occurs in infrequent collocations. The major difference between the groups will be that the L2 English speakers take longer to identify the target word accurately in either type of collocation. We attribute this predicted delay to the fact that they do not possess the language knowledge and experience that the L1 English speakers have. Further, we predict that scores on the English fluency task will be negatively correlated with reaction time on the experimental task within each participant group.
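The predicted negative correlation between fluency and reaction time could be checked within each group with a correlation coefficient. The sketch below computes Pearson's r in plain Python. The function name and the five data points are hypothetical placeholders, and the choice of Pearson's r is itself an assumption, since the proposal does not name a statistic.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Placeholder values for one participant group, not data from this study:
# fluency ratings on the 1-5 scale and mean reaction times in milliseconds.
fluency_scores = [1, 2, 3, 4, 5]
group_mean_rt_ms = [900.0, 820.0, 760.0, 700.0, 610.0]

r = pearson_r(fluency_scores, group_mean_rt_ms)
print(r < 0)  # a negative r is the pattern the prediction expects
```

Because the fluency rating is an ordinal 1-5 scale, a rank-based coefficient such as Spearman's could be substituted for Pearson's r.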


Erman, Britt. (2007). Cognitive processes as evidence of the idiom principle. International Journal of Corpus Linguistics, Vol. 12 (1), 25-53.

Leśniewska, Justyna & Witalisz, Ewa. (2007). Cross-linguistic influence and acceptability judgments of L2 and L1 collocations: A study of advanced Polish learners of English. EUROSLA Yearbook, Vol. 7, 27-48.

Oliver, William L., Healy, Alice F., & Mross, Ernest F. (2005). Trade-offs in Detecting Letters and Comprehending Text. Canadian Journal of Experimental Psychology, Vol. 59 (3), 159-167.

Vogel Sosa, Anna & MacFarlane, James. (2002). Evidence for frequency-based constituents in the mental lexicon: collocations involving the word of. Brain and Language, Vol. 83, 227-236.