Search results: 1-15 of 32 records found for "Corpora" in Literature. Query time: 0.031 seconds.
Microblogs such as Twitter, Facebook, and Sina Weibo (China’s equivalent of Twitter) are a remarkable linguistic resource. In contrast to content from edited genres such as newswire, microblogs cont...
The Penn Discourse Treebank (PDTB) was released to the public in 2008. It remains the largest manually annotated corpus of discourse relations to date. Its focus on discourse relations that are either...
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate...
Automatic paraphrasing is an important component in many natural language processing tasks. In this article we present a new parallel corpus with paraphrase annotations. We adopt a definition o...
Since the Web by far represents the largest public repository of natural language texts, recent experiments, methods, and tools in the area of corpus linguistics often use the Web as a corpus. For app...
We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether o...
One can’t help but be fascinated by two sentences in parallel translation, the selfsame meaning diffused, distributed, diverging across alternative expressions. In his Le Ton beau de Marot: In Prais...
Exclamatives like What a dump!, Wow!, and Boy, you’ve grown! are, when uttered in context, rich in information about the speaker’s attitudes. Drawing on evidence from about 100,000 online product rev...
We use large collections of online product reviews, in Chinese, English, German, and Japanese, to study the use conditions of expressives (swears, antihonorifics, intensives). The distributional eviden...
This paper aims to carve out a place for corpus research within theoretical linguistics and psycholinguistics. We argue that annotated corpora naturally complement native speaker intuitions and contro...
This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a finite state parser run...
A significant portion of the world’s text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages have multiple tags, but the...
Understanding the lead/lag of communities in the context of a given topic is an interesting problem in computational social science. In this work, we study the particular problem of whether research g...
Language documentation involves linguistic analysis of the collected material, which is typically done manually. Automatic methods for language processing usually require large corpora. The method pre...
This contribution explores the potentials of combining corpora of language use data with language description in e-grammars (or digital grammars). We present three directions of ongoing research and d...
