Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation

Authors

  • Luciene Novais Mazza, Pontifícia Universidade Católica de São Paulo - PUCSP

Abstract

This paper presents a computational tool developed to extract three-word lexical bundles and, by working through its use, demonstrates the automatic recognition of recurring lexical items across regulatory documents. The quantitative analysis examines a specific type of document prepared by pharmaceutical companies on matters directly related to public health protection agencies. The same quantitative data collection methods can nonetheless be used to search for other linguistic features across a variety of genres and document types, allowing the linguistics researcher to identify easily which terms belong to a given domain of specialized texts. The study focuses on investigating the frequency of lexical patterns in language use, particularly in the current business context, and seeks to disseminate Douglas Biber's work on recurrent word combinations, which makes use of tools and techniques developed in corpus-based linguistics. The theoretical framework draws primarily on Corpus Linguistics, an approach whose concepts can be connected to computational assumptions in order to design tools for end users and to extract lexical bundles. The corpus comprises documents in English from fifteen different manufacturing sites of a multinational pharmaceutical company, totaling about 110,000 words and written by different authors in different parts of the world. The investigation shows that text-internal features can be searched by extracting lexical bundles across documents of the same specific domain.

Keywords: lexical bundles, domain-specific corpus, linguistic-computational tool.
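
The three-word bundle extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the tool presented in the paper: the function names (tokenize, extract_bundles), the thresholds (min_freq, min_range) and the two sample sentences are hypothetical and chosen only for illustration. The sketch counts how often each three-word sequence occurs (frequency) and in how many documents it occurs (range), the two measures commonly used to identify lexical bundles.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase the text and keep alphabetic word forms,
    # allowing internal apostrophes or hyphens.
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def extract_bundles(documents, n=3, min_freq=2, min_range=2):
    """Return {bundle: (frequency, range)} for n-word sequences.

    documents maps a document id to its raw text. The thresholds
    min_freq and min_range are illustrative assumptions, not the
    cut-offs used in the study.
    """
    freq = Counter()             # total occurrences of each bundle
    doc_ids = defaultdict(set)   # documents in which each bundle occurs
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        # Slide a window of n tokens over the document. Sentence
        # boundaries are ignored here, which is a simplification.
        for i in range(len(tokens) - n + 1):
            bundle = " ".join(tokens[i:i + n])
            freq[bundle] += 1
            doc_ids[bundle].add(doc_id)
    return {
        bundle: (count, len(doc_ids[bundle]))
        for bundle, count in freq.most_common()
        if count >= min_freq and len(doc_ids[bundle]) >= min_range
    }


# Invented sample texts standing in for documents from two sites.
docs = {
    "site_a": "The product was manufactured in accordance with the approved process.",
    "site_b": "All batches were tested in accordance with the registered specification.",
}
bundles = extract_bundles(docs)
for bundle, (count, spread) in bundles.items():
    print(bundle, count, spread)
```

With these two sample sentences, only "in accordance with" and "accordance with the" meet both thresholds. On a corpus of about 110,000 words, the thresholds would normally be set higher and the frequencies normalized per number of words.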

Author Biography

Luciene Novais Mazza, Pontifícia Universidade Católica de São Paulo - PUCSP

PhD in Applied Linguistics and Language Studies. Technical Translator at PÓLUX Traduções XT. Full Professor at Universidade Paulista - UNIP.

Published

2015-12-21

How to Cite

Mazza, L. N. (2015). Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation. Calidoscópio, 13(3), 424–439. Retrieved from https://www.revistas.unisinos.br/index.php/calidoscopio/article/view/cld.2015.133.13