Document Type : Articles


1 English Department, Kashan University

2 Knowledge and Information Science, Payame Noor University, Mashhad, Iran


As members of the larger family of formulaic sequences, lexical bundles serve different discourse functions in written research articles. This study investigated the use of four-word lexical bundles in published medical research articles, using natural language processing techniques from computational linguistics. A corpus of 2,420,914 words was compiled from 790 research articles in 33 medical disciplines. To identify the lexical bundles, several software products were used: ABBYY FineReader 10 Professional Edition, Total Assistant, AntConc 3.2.3, and WordSmith Tools 5. The identified bundles were then classified structurally and functionally according to taxonomies in the literature. The results showed that the 102 identified lexical bundles differ structurally and functionally, and that most writers of medical research articles rely on text-oriented bundles to establish their written academic discourse. The study offers new insights into the discipline-specific discourse of medical research articles and into further corpus-based research on written academic discourse and EAP. It also introduces a stylistic-linguistic perspective into the development of information retrieval systems.
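The identification step described above can be illustrated with a minimal sketch. It is not the procedure run in AntConc or WordSmith Tools for this study; it only shows the general idea of counting four-word sequences and keeping those that recur across several texts. The function name `four_word_bundles` and the cut-offs `min_count` and `min_texts` are hypothetical, and the abstract does not report the thresholds actually used.

```python
import re
from collections import Counter, defaultdict


def four_word_bundles(texts, min_count=2, min_texts=2):
    """Return four-word sequences that meet a frequency and a dispersion cut-off.

    min_count and min_texts are illustrative values only (assumptions),
    not the thresholds applied in the study.
    """
    counts = Counter()          # total occurrences of each four-word sequence
    spread = defaultdict(set)   # indices of the texts each sequence occurs in
    for doc_id, text in enumerate(texts):
        # Simple lowercase word tokenizer; punctuation is discarded, so
        # sequences may span clause boundaries in this sketch.
        tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
        for i in range(len(tokens) - 3):
            gram = " ".join(tokens[i:i + 4])
            counts[gram] += 1
            spread[gram].add(doc_id)
    return {
        gram: n
        for gram, n in counts.items()
        if n >= min_count and len(spread[gram]) >= min_texts
    }


sample = [
    "On the other hand, the results of the study were mixed.",
    "The results of the study, on the other hand, were clear.",
]
bundles = four_word_bundles(sample)
for gram, n in sorted(bundles.items()):
    print(n, gram)
```

The dispersion cut-off (`min_texts`) keeps a sequence that is repeated many times in a single article from being counted as a bundle of the whole corpus.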

  1. Altenberg, B. (1998). On the phraseology of spoken English: the evidence of recurrent word-combinations. In A. P. Cowie (Ed.), Phraseology: theory, analysis and applications (pp. 101–122). Oxford: Oxford University Press.
  2. Anthony, L. (2007). Antconc 3.2.1: A free text analysis software. Retrieved from
  3. Biber, D. (2004). Lexical bundles in academic speech and writing. In B. Lewandowska-Tomaszczyk (Ed.), Practical Applications in Language and Computers (pp. 165–178). Frankfurt: Peter Lang.
  4. Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific Purposes, 26, 263–286.
  5. Biber, D., Conrad, S., & Cortes, V. (2004). ‘If you look at lexical bundles in university teaching and textbooks’. Applied Linguistics, 25(3), 371–405.
  6. Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman Grammar of Spoken and Written English. London: Longman.
  7. Chen, Y., & Baker, P. (2010). Lexical Bundles in L1 and L2 academic writing. Language Learning & Technology, 14(2), 30–49.
  8. Cortes, V. (2002). Lexical bundles in academic writing in history and biology. Doctoral dissertation, Northern Arizona University.
  9. Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for Specific Purposes, 23, 397–423.
  10. De Cock, S. (1998). A recurrent word combination approach to the study of formulae in the speech of native and non-native speakers of English. International Journal of Corpus Linguistics, 3(1), 59–80.
  11. De Cock, S., Granger, S., Leech, G., & McEnery, T. (1998). An automated approach to the phrasicon of EFL learners. In S. Granger (Ed.), Learner English on computer (pp. 67–79). London: Longman.
  12. Dufon, M. (1995). The acquisition of gambits by classroom foreign language learners of Indonesian. In M. Alves (Ed.), Papers from the 3rd annual meeting of the Southeast Asian Linguistic Society (pp. 27–42). Tempe: Arizona State University, Program for Southeast Asian Studies.
  13. Erman, B. (2007). Cognitive processes as evidence of the idiom principle. International Journal of Corpus Linguistics, 12, 25–53.
  14. Firth, J. R. (1951). Modes of meaning. Essays and Studies (The English Association), 118–149.
  15. Gass, S., & Mackey, A. (2002). Frequency effects and second language acquisition. Studies in Second Language Acquisition, 24, 249–260.
  16. Hakuta, K. (1974). Prefabricated patterns and the emergence of structure in second language acquisition. Language Learning, 24, 287–297.
  17. House, J. (1996). Developing pragmatic fluency in English as a foreign language. Studies in Second Language Acquisition, 18, 225–252.
  18. Hyland, K. (2008a). Academic clusters: text patterning in published and postgraduate writing. International Journal of Applied Linguistics, 18(1), 1–9.
  19. Hyland, K. (2008b). As can be seen: lexical bundles and disciplinary variation. English for Specific Purposes, 27, 4–21.
  20. Jalali, H. (2009). Lexical Bundles in Applied Linguistics: Variations within a Single Discipline. Unpublished doctoral thesis, University of Isfahan, Isfahan, Iran.
  21. Jespersen, O. (1924). The philosophy of grammar. London: George Allen and Unwin.
  22. Karlgren, J. (2000). Information retrieval: statistics and linguistics. A short introduction to textual information retrieval. Kista, Sweden: Swedish Institute of Computer Science, Human Machine Interaction and Language Engineering Laboratory.
  23. Marco, M. J. (2000). Collocational frameworks in medical research papers: a genre based study. English for Specific Purposes, 19, 63–86.
  24. Martinez, I. (2003). Aspects of theme in the method and discussion sections of biology journal articles in English. Journal of English for Academic Purposes, 2(2), 103–123.
  25. Nattinger, J., & DeCarrico, J. (1992). Lexical phrases and language teaching. Oxford: Oxford University Press.
  26. Nekrasova, T. (2009). English L1 and L2 speakers’ knowledge of lexical bundles. Language Learning, 59(3), 647–686.
  27. Parvizi, N. (2011). Identification of discipline-specific lexical bundles in education. Unpublished master’s thesis, University of Kashan, Kashan, Iran.
  28. Salem, A. (1987). Pratique des segments répétés. Paris: Institut National de la Langue Française. In Tremblay, A., et al. (2007). Are lexical bundles stored and processed as single units? Proc. 23rd Northwest Linguistics Conference, Victoria BC CDA.
  29. Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
  30. Schmitt, N., Grandage, S., & Adolphs, S. (2004). Are corpus-derived recurrent clusters psycholinguistically valid? In N. Schmitt (Ed.), Formulaic Sequences (pp. 127–152). Amsterdam: John Benjamins Publishing.
  31. Scott, M. (2008). WordSmith Tools version 5. Liverpool: Lexical Analysis Software.
  32. Stubbs, M. (2007a). An example of frequent English phraseology: distribution, structures and functions. In R. Facchinetti (Ed.), Corpus Linguistics 25 years on (pp. 89–105). Amsterdam: Rodopi.
  33. Stubbs, M. (2007b). Quantitative data on multi-word sequences in English: the case of the word ‘world’. In M. Hoey, M. Mahlberg, M. Stubbs & W. Teubert (Eds.), Text, Discourse and Corpora: Theory and Analysis (pp. 163–189). London: Continuum.
  34. Valipoor, L. (2010). A corpus-based study of words and bundles in chemistry research articles. Unpublished master’s thesis, University of Kashan, Kashan, Iran.
  35. Wang, J., Liang, Sh., & Ge, G. (2008). Establishment of a medical academic word list. English for Specific Purposes, 27, 442–458.
  36. Wood, D. (2006). Uses and functions of formulaic sequences in second language speech: an exploration of the foundations of fluency. Canadian Modern Language Review, 63, 13–33.
  37. Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.