Named Entities Workshop (NEWS) 2009

 

Call for Papers

Workshop Focus

Named Entities (NEs) play a critical role in Natural Language Processing (NLP) and Information Retrieval (IR) tasks such as search, machine translation, document clustering, summarization and information extraction. While identifying and analyzing NEs in a given natural language is a challenging research problem in itself, the phenomenal growth of the Internet user population, especially in the non-English-speaking parts of the world, has extended this problem to the cross-language arena, making the handling of NEs in multiple languages critically important.

The purpose of this workshop is to bring together researchers interested in various aspects of NEs in natural language text.  In addition, the NEWS workshop will feature a shared task on Machine Transliteration of NEs.

Important Dates

Research Paper Submissions

Research Paper Submission Deadline: 1-May-2009

Shared Task

Registration Opens: 16-Feb-2009
Registration Closes: 9-Apr-2009
Release Training/Development Data: 16-Feb-2009
Release Test Data: 10-Apr-2009
Results Submission Due: 14-Apr-2009
Results Announcement: 29-Apr-2009
Task (short) Papers Due: 3-May-2009

For All Submissions

Acceptance Notification: 1-Jun-2009
Camera-Ready Copy Deadline: 7-Jun-2009
Workshop Date: 7-Aug-2009

Topics of Interest

This workshop invites original research contributions on all aspects of Named Entities (NEs), including identification, analysis, extraction, mining, transformation and applications to NLP and IR systems. The topics of interest include, but are not limited to, the following:

NE Analysis

  • Distributional characteristics of NEs in mono- & multi-lingual corpora
  • Orthographic/phonetic characteristics of NEs
  • NE origin/genre recognition
  • Social network analysis and entity resolution

NE Extraction

  • Language-independent monolingual NE extraction
  • Cross-language NE extraction
    • General techniques
    • Specific datasets (such as Wikipedia, news, etc.)
  • Unsupervised and semi-supervised methods for NE extraction
  • Complex NEs, domain-specific term extraction
  • NE set expansion
  • Creation of annotated data

Machine Transliteration

  • Computational phonology, including modeling of phonological rules, structure, behavior, etc.
  • Transliteration modeling
    • Phonetic, phonetic-semantic transliteration, grapheme → phoneme and phoneme → grapheme conversions
    • Statistical and machine learning based approaches, transliteration unit alignment
    • Forward and backward transliterations
    • Learning transliteration from comparable corpora, transliteration lexicon construction
    • Romanization of Asian languages
  • Transliteration evaluation metrics

Applications

  • Monolingual and Cross-Language IR
  • Machine Translation
  • Information Extraction and Management
  • Question Answering
  • Computational Journalism

Paper Format

Paper submissions to NEWS 2009 should follow the ACL-IJCNLP-2009 paper submission policy, including the paper format, blind review policy, and title and author format conventions. Full papers (research papers) must be in two-column format and must not exceed eight (8) pages of content plus one extra page for references. Short papers (task papers) must also be in two-column format and must not exceed four (4) pages, including references. Submissions must conform to the official ACL-IJCNLP-2009 style guidelines. For details, please refer to

https://acl-ijcnlp-2009.org/main/authors/stylefiles/index.html.

Paper Submission

Submissions are made electronically through the paper submission software at:

https://www.softconf.com/acl-ijcnlp09/NEWS/.

 

Shared Task on Transliteration

Transliteration of NEs is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. These applications call for high-performance transliteration systems, which are the focus of the shared task in this workshop.

Details of the task are available here.

Organizing Committee

Haizhou Li, Institute for Infocomm Research
A Kumaran, Microsoft Research India
Sanjeev Khudanpur, Johns Hopkins University
Raghavendra Udupa, Microsoft Research India
Min Zhang, Institute for Infocomm Research
Monojit Choudhury, Microsoft Research India

Program Committee

Kalika Bali, Microsoft Research India
Rafael Banchs, UPC, Spain
Sivaji Bandyopadhyay, Univ of Jadavpur, India
Pushpak Bhattacharyya, IIT-Bombay, India
Monojit Choudhury, Microsoft Research India
Marta Ruiz Costa-jussà, UPC, Spain
Jianfeng Gao, Microsoft Research, USA
Gregory Grefenstette, Exalead, France
Sanjeev Khudanpur, Johns Hopkins University, USA
Kevin Knight, ISI, USA
Greg Kondrak, Univ of Alberta, Canada
Olivia Kwong, City U., Hong Kong
Gina-Anne Levow, Univ of Chicago, USA
Arul Menezes, Microsoft Research, USA
Jong-Hoon Oh, NICT, Japan
Yan Qu, Advertising.com, USA
Dan Roth, Univ of Illinois, Urbana-Champaign, USA
Sunita Sarawagi, IIT-Bombay, India
Sudeshna Sarkar, IIT-Kharagpur, India
Richard Sproat, Univ of Illinois, Urbana-Champaign, USA
Keh-Yih Su, Behavior Design Corporation, Taiwan
Raghavendra Udupa, Microsoft Research India
Vasudeva Varma, IIIT-Hyderabad, India
Min Zhang, Institute for Infocomm Research, Singapore

Contact Information

For any information about the workshop or the shared task, please contact:

Dr. A. Kumaran
Microsoft Research India
Scientia, 196/36, Sadashivnagar 2nd Main Road, Bangalore
INDIA 560080
a.kumaran@microsoft.com

Copyright © 2009 Chinese and Oriental Languages Information Processing Society | Last updated June 14, 2009