The second plenary at the 13th Teaching and Language Corpora conference, held in Cambridge in July 2018, was delivered by Anne O’Keeffe of Mary Immaculate College, University of Limerick (see TaLC 2018 conference links and references for the first plenary by Susan Hunston). Her topic was the interface of data-driven learning (DDL) with second language acquisition (SLA).
O’Keeffe’s talk brought together a discussion of the interface debate in SLA recently addressed in Han and Finneran (2013) with language learning theories underpinning data-driven learning tackled by Lynne Flowerdew (2015). Her slides are here. I start by looking at the background to the interface debate and the learning theories in turn, using O’Keeffe’s references (in her slides and below) plus some other reading.
The interface debate in SLA
Han and Finneran (2013) provide the background to this debate which compares explicit, conscious learning of second language lexis, morphosyntax, and phonology with implicit, intuitive identification of constructions and form-function mapping. Explicit learning is operationalised as metalinguistic knowledge used in careful writing and pedagogical exercises, for example, while implicit learning is thought to inform spontaneous, unplanned spoken production under pressure of time, for instance.
The interface question concerns the relationship between the two. The strong interface position claims that everything in language is learnable and teachable, and that explicit learning becomes implicit over time (cf skill acquisition theory, DeKeyser 2007). The weak interface position maintains a view that everything is learnable, and posits limited conversion of explicit to implicit learning. R Ellis (2002, 2008) claims that explicit learning can become implicit only when a learner is developmentally ready. N Ellis (2002, 2005) takes the usage-based position that the majority of L2 learning is implicit, but explicit learning can “re-tune” L1-influenced pattern detectors to help learners attend to relevant features of L2 input. Finally the non-interface position states that not everything is learnable, that language acquisition is too complex to be accomplished through explicit learning, and that what is learned explicitly cannot be accessed implicitly (“is not deployable in real, spontaneous communication,” Han & Finneran, 2013: 4). This last position is most famously espoused by Krashen (1977) and has strongly influenced L2 teaching via communicative language teaching (CLT).
Han and Finneran (2013) review studies supporting each position and note an evolution in SLA research from a non-interface position in the 1980s, when Chomskyan Universal Grammar dominated linguistic theory, to a weak-interface position following an increase in research in instructed SLA since 2000. These authors interpret ongoing debate on the interface question as evidence for the validity of each position, and call for more fine-grained analysis to determine which domains of SLA seem to allow which type of interface. They go on to review evidence from a study of fossilisation showing variability across learners and morphosyntactic structures, which they interpret in the light of the interface debate. Backsliding and synchronous variability, they claim, suggest no interface, while the maintenance in interlanguage of nontargetlike forms suggests a strong interface. This interpretation “speaks to a possible co-existence of the presence and absence of explicit-implicit interface within any given interlanguage” (Han & Finneran 2013: 12). They therefore conclude with a call for a research programme to investigate “which aspects of grammar are susceptible to a strong interface, a weak interface, or no interface across and within second language learners?” (Han & Finneran 2013: 14).
Language learning theories and DDL teaching
Flowerdew’s chapter on language learning theories announces three different approaches to anchoring DDL activities in the larger language learning enterprise, one cognitivist (noticing in usage-based approaches) and the others constructivist (including discovery and experiential learning, as well as learner agency); she adds a section on learning styles.
For me one of the best rationales for DDL remains Cobb (2005). Data-driven learning involves learners in “grappling with raw data” as Cobb explains in his discussion of constructivism:
representations constructed from grappling with raw data, as opposed to representations resulting from someone else’s having grappled, are not just generally “better” in some vague way but specifically are more successfully transferred to novel contexts and form a better preparation for further independent learning. This paradigm also proposes a methodology for helping learners perform this grappling with raw data, namely the adaptation of the tools and methods that experts have developed over the years to help them with their own grappling. Like learners, experts in any domain experience difficulties in their encounters with unencoded data, but unlike learners they have developed tools and methods to overcome these difficulties
Cobb thus values inductive over deductive approaches to data-driven vocabulary learning, for instance. His work is cited in Flowerdew (2015), a chapter cited by O’Keeffe and outlined in the table below. The pedagogical implications and DDL activities are from Flowerdew’s chapter with some supplementary definitions and links which I have added.
“intake is that part of the input that the learner notices”
“learners’ acquisition of linguistic input is more likely to increase if their attention is consciously drawn to linguistic features” (16)
concordance-based tasks requiring students to attend to recurrent phrases would seem to be an ideal means for enhancing learners’ input via noticing, leading to uptake (20)
teacher-directed noticing activities
hypothesis formation through inductive corpus-based exercises
explicit explanations from the teacher to confirm or correct these hypotheses
hypothesis testing through follow-up exercises
pattern-hunting (turning up ideas and expressions) versus pattern-defining (checking a specific target pattern)
exploratory, experiential, discovery, process-based learning
“learners should not be handed fully formed or “pre-emptively encoded” word meanings, but rather should grapple with raw evidence, constructing their own meanings out of numerous partial encounters with instances” Cobb (2005)
“the more possible starting points a corpus offers for exploitation, the more likely it is that there exists an appropriate starting point for a specific learner” (Widman et al 2011) (24)
students toggle between the ‘inductive’ sub-corpora and the ‘deductive’ grammar guide (25)
SACODEYL European Youth Language https://www.um.es/sacodeyl/
Chemnitz internet grammar (CING)
“socially situated models of cognition, […] view interaction as a context for cognition, rather than vice versa. Rather than judging interaction in terms of its outcome for learning, learning is viewed as an inevitable outcome of interaction” Whyte (2011)
inductive vs deductive learning, field dependence vs field independence
group work in association with corpus activities
learner choice to encourage greater autonomy
depth of knowledge: building an integrated lexicon
breadth of knowledge: increase vocabulary size
O’Keeffe argues that the SLA debate over the respective contributions of explicit and implicit learning has important consequences for DDL. She outlines the three positions described by Han and Finneran (2013) and draws the following implications:
the strong interface position implies a teaching focus on forms, that is, overt teaching of a discrete-item grammatical syllabus
the weak interface position implies focus on form_, that is, meaning-focused activities with occasional brief switches of attention to form
the non-interface position implies a teaching focus on meaning only.
She discusses the type of empirical results required to test the validity of each hypothesis, suggesting the importance of research design and instruments with respect to test items, tasks, learner factors, and teachers. She suggests that if the strong interface position is correct, learners will use forms correctly in controlled and free tasks. If the non-interface position is correct, errors will occur in free but not controlled tasks. More nuanced findings would indicate a weak interface (slide 21). She concludes in agreement with Han and Finneran (2013) that all three positions are likely to be valid and that language teaching and teaching research should pursue the three hypotheses.
I have a number of reservations about both the explicit/implicit learning controversy in SLA and the application of learning theories in DDL as presented here. I feel that somewhat hasty conclusions with far-reaching implications for language teaching with corpora are being drawn without full consideration of acquisitional or pedagogical issues. And I agree with O’Keeffe that it is important for the field and its place in both L2 research and in language education that the community should take a clear position on issues that should inform the design of DDL activities and programmes, and the research that is conducted on their effectiveness.
Among the shortcuts I see in the research cited by O’Keeffe are
1. a broad church approach to explicit versus implicit learning
With respect to the interface question, Han and Finneran 2013 draw this conclusion which O’Keeffe reiterates:
in view of the conflicting arguments across the interface and non-interface positions, it appears likely that each position has some validity to it. (2013: 7)
To me this is akin to saying that since some climate scientists deny a human role in global warming we should continue to keep an open mind. Just because a debate exists doesn’t mean all sides are equally valid, and it is a simple affair to keep a putative controversy alive artificially. Perhaps more seriously, it seems to me that the weak interface is already a compromise between the strong and non-interface positions, and it allows researchers to investigate where and how explicit and implicit learning influence interlanguage development. In other words, if you accept the weak interface argument that L2 development can include both implicit and explicit learning, you do not also need to accept (and logically cannot accept) either the strong position (all explicit learning can become implicit, or all implicit learning derives from an explicit foundation) or the non-interface position (explicit learning can never become implicit). Long (2017) has an interesting discussion of implicit/explicit and incidental/intentional learning which I think is relevant here (see this post).
Having said that, the actual importance of the explicit/implicit learning interface for DDL practitioners is perhaps less than critical. Learner use of language corpora implies explicit learning. It suggests a focus on formS to gain declarative knowledge of collocations and colligations, for example, as opposed to the type of focus on form_ advocated by communicative and task-based language teaching approaches, where all learning of forms is embedded in meaning-focused activities. To do this you need to accept a strong or weak interface position at least implicitly (unless you are prepared to settle for explicit L2 learning which never leads to acquisition, which seems unusual to say the least). I can see no teacher or learner use of language corpora where implicit learning might be triggered. This leads me to my second reservation.
2. pedagogical implications of SL learning theories
The research on instructed L2 learning cited by L. Flowerdew (2015) represents a wide range of approaches which seem more or less compatible with DDL. Usage-based research (N Ellis, see also Susan Hunston’s TaLC 2018 plenary) obviously has a natural affinity with DDL since they share an interest in corpora and corpus tools and a cognitivist agenda, using authentic data to shed light on input available to language users as well as characterise their output. Constructivism, too, makes sense in the form of Cobb’s argument that learning from one’s own “grappling” with authentic language is likely to be more effective in terms of retention and transfer to new contexts than grappling by proxy, that is, explicit teaching of the results of a teacher’s or material writer’s engagement with authentic language.
I am less convinced of the application of Vygotskyan sociocultural theory to DDL, unless in a diluted form where corpus work alternates with other forms of language use and learning, in which case DDL becomes one element of a wider approach to language teaching and learning, rather than a learning programme in its own right. Learners can share searches and work in groups, and receive scaffolding and feedback from peers and teachers, but actual interaction with a corpus remains an individual endeavour. Thus it is more compatible with a cognitivist approach which seeks to account for learning occurring in one learner’s mind, rather than the socially informed learning of interest to socio-constructivists.
Similarly, I think it is somewhat questionable to see concordance lines as so many opportunities for noticing. I understand Schmidt’s original formulation of noticing as something akin to Tomlin and Villa’s (1994) notion of detection, that is, something attended to during language use rather than language study, and thus closer to implicit than to explicit learning. Again, Long (2017) has a discussion of these distinctions as well as of empirical work on teaching/learning applications. These pedagogical studies (Cintron-Valentin & Ellis 2015, Malone 2016) are based on computer-delivered materials which are of less interest to me as a language educator working with classroom technologies (as opposed to the computer suites required for these materials), but may be of interest to DDL colleagues working in computer-rich environments.
I also have reservations about equating the freedom of the search box with learner choice, agency, and autonomy as both Flowerdew and O’Keeffe have done. While it is undoubtedly preferable to offer learners as much choice as possible in building and using corpora and to cater for different levels of proficiency, motivation, and so on, I think we have to be careful about viewing language corpora as a silver bullet (as with other CALL tools and pedagogies). Regarding DDL, a number of papers at TaLC 2018 showed that even advanced learners were often ill-equipped to find the information they wanted via corpus tools (Charles), and sometimes wrong in the conclusions they drew (Tyne).
In conclusion, it should be clear if only from the length of this post that I find the SLA/DDL intersection an interesting area for research and teaching. I think it important that L2 researchers and corpus linguists engage with each other’s work. Perhaps this cross-fertilisation is more developed in SLA research than in L2 pedagogy. If so, TaLC is no doubt an important forum for focusing on DDL pedagogy informed by research in L2 education.
[blue = O’Keeffe’s references, black = my additions]
Boulton, A. and Cobb, T. (2017) “Corpus Use in Language Learning: A Meta-Analysis”. Language Learning, 67(2): 348-393.
Chapelle, C. A. (2003) English Language Learning and Technology. Amsterdam: John Benjamins.
Cintron-Valentin, M. and Ellis, N. (2015) Exploring the interface: explicit focus-on-form instruction and learned attentional biases in L2 Latin. Studies in Second Language Acquisition 37(2): 197–235. https://doi.org/10.1017/s0272263115000029
Cobb, T. (2005). Constructivism, applied linguistics, and language education. Encyclopedia of Language and Linguistics, 2nd. ed.
DeKeyser, R. (2007). Skill acquisition theory. Theories in second language acquisition: An introduction, 97-113.
Ellis, N. (2005) At the interface: dynamic interactions of explicit and implicit language knowledge. Studies in Second Language Acquisition, 27.2: 305–52.
Ellis, N. C. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in second language acquisition, 24(2), 143-188.
Ellis, R. (2008) Investigating grammatical difficulty in second language learning: Implications for second language acquisition research and language testing. International Journal of Applied Linguistics 18.1: 4–22
Ellis, R. (2002) Does form-focused instruction affect the acquisition of implicit knowledge? A review of the research. Studies in Second Language Acquisition 24.2: 223–36.
Flowerdew, L. (2015). Data-driven learning and language learning theories: Whither the twain shall meet. In A. Leńko-Szymańska & A. Boulton (Eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, pp. 15–36.
Graus, J. & Coppen, P.-A. 2016. Student teacher beliefs on grammar instruction. Language Teaching Research, 20(5): 571-599.
Han, Z.-H., & Finneran, R. (2014). Re-engaging the interface debate: Strong, weak, none, or all? International Journal of Applied Linguistics, 24(3), 370-389.
Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An experimental study on ESL relativization. Studies in second language acquisition, 24(4), 541-577.
Johns, T. (1994) “From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning”. In T. Odlin (Ed.), Perspectives on Pedagogical Grammar. New York: Cambridge University Press, pp. 293-313.
Krashen, S. 1977. Some Issues Relating to the Monitor Model. In H. Brown, C. Yorio, and R. Crymes (eds). On TESOL ’77, Washington DC: Teachers of English to Speakers of Other Languages, pp. 144-158.
Laufer, B. and Hulstijn, J. (2001) “Incidental vocabulary acquisition in a second language: The construct of task-induced involvement”. Applied Linguistics, 22: 1-26.
Lee, H., Warschauer, M. and Lee, J.-H. (2018) “The Effects of Corpus Use on Second Language Vocabulary Learning: A Multilevel Meta-analysis”. Applied Linguistics, (advance online: https://academic.oup.com/applij/advance-article-abstract/doi/10.1093/applin/amy012/4953772)
Long, M. (2017). Instructed second language acquisition (ISLA): geopolitics, methodological issues, and some major research questions. ISLA, 1(1): 7-44.
Malone, J. (2016) Incidental vocabulary learning in SLA: effects of frequency and aural enhancement (Qualifying Paper. PhD in SLA Program). College Park: University of Maryland.
Papp, S. Inductive learning and self-correction with the use of learner and reference corpora. In Hidalgo, E., Quereda, L. and J. Santana (eds). Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, pp. 207-220.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction. Cambridge: Cambridge University Press, pp. 3-32.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied linguistics, 11(2), 129-158.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in second language acquisition, 16(2), 183-203.
Whyte, S. (2017). Focus on form(s): principles and practice. On teaching languages with technology.
Whyte, S. (2011). Socio-constructivism. Learning and Teaching Foreign Languages, UOH.