Fanglue Subject Navigation (方略学科导航)

Search results: 1-15 of 32 records found for “Literature Corpora”. Query time: 0.031 seconds.

Mining Parallel Corpora from Sina Weibo and Twitter
Keywords: Mining Parallel Corpora Sina Weibo Twitter. Date: 2016/7/7

Microblogs such as Twitter, Facebook, and Sina Weibo (China’s equivalent of Twitter) are a remarkable linguistic resource. In contrast to content from edited genres such as newswire, microblogs cont...

Reflections on the Penn Discourse TreeBank, Comparable Corpora, and Complementary Annotation
Keywords: Penn Discourse TreeBank Comparable Corpora Complementary Annotation. Date: 2015/9/14

The Penn Discourse Treebank (PDTB) was released to the public in 2008. It remains the largest manually annotated corpus of discourse relations to date. Its focus on discourse relations that are either...

Evaluating Centering for Information Ordering Using Corpora
Keywords: Evaluating Centering Information Ordering Using Corpora. Date: 2015/9/7

In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate...

Constructing Corpora for the Development and Evaluation of Paraphrase Systems
Keywords: Paraphrase Systems Constructing Corpora. Date: 2015/9/6

Automatic paraphrasing is an important component in many natural language processing tasks. In this article we present a new parallel corpus with paraphrase annotations. We adopt a definition o...

Orthographic Errors in Web Pages: Toward Cleaner Web Corpora
Keywords: Orthographic Errors Web Pages Cleaner Web Corpora. Date: 2015/9/1

Since the Web by far represents the largest public repository of natural language texts, recent experiments, methods, and tools in the area of corpus linguistics often use the Web as a corpus. For app...

Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Keywords: Machine Translation Performance Exploiting Non-Parallel Corpora. Date: 2015/8/31

We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether o...

Parallel Text Processing: Alignment and Use of Translation Corpora
Keywords: Translation Corpora Alignment. Date: 2015/8/26

One can’t help but be fascinated by two sentences in parallel translation, the selfsame meaning diffused, distributed, diverging across alternative expressions. In his Le Ton beau de Marot: In Prais...

Exclamatives and heightened emotion: Extracting pragmatic generalizations from large corpora
Keywords: corpus pragmatics exclamatives expressives logistic regression. Date: 2015/6/15

Exclamatives like What a dump!, Wow!, and Boy, you’ve grown! are, when uttered in context, rich in information about the speaker’s attitudes. Drawing on evidence from about 100,000 online product rev...

The pragmatics of expressive content: Evidence from large corpora
Keywords: expressives intensives antihonorifics corpus pragmatics logistic regression Chinese, English German Japanese. Date: 2015/6/15

We use large collections of online product reviews, in Chinese, English, German, and Japanese, to study the use conditions of expressives (swears, antihonorifics, intensives). The distributional eviden...

Developing linguistic theories using annotated corpora
Keywords: Developing linguistic theories annotated corpora. Date: 2015/6/15

This paper aims to carve out a place for corpus research within theoretical linguistics and psycholinguistics. We argue that annotated corpora naturally complement native speaker intuitions and contro...

AUTOMATIC ACQUISITION OF A LARGE SUBCATEGORIZATION DICTIONARY FROM CORPORA
Keywords: AUTOMATIC ACQUISITION LARGE SUBCATEGORIZATION DICTIONARY CORPORA. Date: 2015/6/12

This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a finite state parser run...

Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
Keywords: Labeled LDA supervised topic model credit attribution multi-labeled corpora. Date: 2015/6/12

A significant portion of the world’s text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages have multiple tags, but the...

Who Leads Whom: Topical Lead-Lag Analysis across corpora
Keywords: Who Leads Whom Topical Lead Lag Analysis across corpora. Date: 2015/6/10

Understanding the lead/lag of communities in the context of a given topic is an interesting problem in computational social science. In this work, we study the particular problem of whether research g...

Unsupervised morphological analysis of small corpora: First experiments with Kilivila
Keywords: Unsupervised morphological analysis small corpora First experiments with Kilivila. Date: 2015/4/21

Language documentation involves linguistic analysis of the collected material, which is typically done manually. Automatic methods for language processing usually require large corpora. The method pre...

Prospects for e-grammars and endangered languages corpora
Keywords: e-grammars endangered languages corpora. Date: 2015/4/21

This contribution explores the potentials of combining corpora of language use data with language description in e-grammars (or digital grammars). We present three directions of ongoing research and d...
