7th Web as Corpus Workshop (WAC-7)

To be held in association with WWW2012 in Lyon, France, 17th April 2012

Sponsored by ACL SIGWAC

More and more people are using Web data for linguistic and NLP research: the Web provides an easy source of linguistic data in a great variety of languages. However, a ‘crawl’ is not ready for exploration in the same way a traditional ‘corpus’ is. We need to turn a crawl into a corpus. The workshop, the seventh in an annual series, provides a venue for exploring what this conversion involves, how to do it, and what we find out when we do.

We invite submissions which:

  • describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, de-duplication, language-id, tokenising, indexing, ...)
  • explore characteristics of Web data from a linguistics/NLP perspective including registers, domains, frequency distributions, comparisons between datasets
  • use crawled Web data for NLP purposes (with emphasis on the data rather than the use)

Previous WAC workshops have been co-located with various conferences in computational linguistics. This time the workshop is co-located with WWW2012, the main international conference on Web technologies and their impact on society.

wiki:Programme

Organising committee

  • Adam Kilgarriff (Lexical Computing Ltd.)
  • Serge Sharoff (University of Leeds, Workshop Chair)

Programme committee

Organising committee plus:

  • Silvia Bernardini, U of Bologna, Italy
  • Stefan Evert, U of Osnabrück, Germany
  • Cédrick Fairon, UCLouvain, Belgium
  • William H. Fletcher, U.S. Naval Academy, USA
  • Gregory Grefenstette, Exalead, France
  • Igor Leturia, Elhuyar Fundazioa, Basque Country, Spain
  • Preslav Nakov, National U of Singapore
  • Jan Pomikalek, Masaryk U, Czech Republic
  • Reinhard Rapp, U Mainz, Germany
  • Kevin Scannell, Saint Louis U, USA
  • Gilles-Maurice de Schryver, U Gent, Belgium
  • Pierre Zweigenbaum, LIMSI, France
