Changes between Version 13 and Version 14 of WAC-X


Timestamp:
04/11/16 15:41:11
Author:
Roland Schäfer
Comment:

--

  • WAC-X

    v13 v14  
    88August 12, 2016, Berlin[[BR]]
    99
    10 '''[#cfp The Call for Papers is out!]'''
     10Contact email: wacx2016@gmail.com[[BR]]
     11
     12'''[#cfp 11 April 2016: The SECOND Call for Papers is out!]'''
    1113
    1214
    1315== WAC-X main workshop ==
    1416
    15 The World Wide Web has become increasingly popular as a source of linguistic data, not only within the NLP communities, but also with theoretical linguists facing problems of data sparseness or data diversity. Accordingly, web corpora continue to gain importance, given their size and diversity in terms of genres/text types. The field is still new, though, and a number of issues in web corpus construction need much additional research, both fundamental and applied. These issues range from questions of corpus design (e.g., assessment of corpus composition, sampling strategies and their relation to crawling algorithms, and handling of duplicated material) to more technical aspects (e.g., efficient implementation of individual post-processing steps in document cleaning and linguistic annotation, or large-scale parallelization to achieve web-scale corpus construction). Similarly, the systematic evaluation of web corpora, for example in the form of task-based comparisons to traditional corpora, has only recently shifted into focus. For almost a decade, the ACL SIGWAC (http://www.sigwac.org.uk/), and especially the highly successful Web as Corpus (WAC) workshops have served as a platform for researchers interested in compilation, processing and application of web-derived corpora. Past workshops were co-located with major conferences on computational linguistics and/or corpus linguistics (such as EACL, NAACL, LREC, WWW, and Corpus Linguistics).
     17The World Wide Web has become increasingly popular as a source of linguistic data, not only within the NLP communities, but also with theoretical linguists facing problems of data sparseness or data diversity. Accordingly, web corpora continue to gain importance, given their size and diversity in terms of genres/text types. The field is still new, though, and a number of issues in web corpus construction need much additional research, both fundamental and applied. These issues range from questions of corpus design (e.g., assessment of corpus composition, sampling strategies and their relation to crawling algorithms, and handling of duplicated material) to more technical aspects (e.g., efficient implementation of individual post-processing steps in document cleaning and linguistic annotation, or large-scale parallelization to achieve web-scale corpus construction). Similarly, the systematic evaluation of web corpora, for example in the form of task-based comparisons to traditional corpora, has only recently shifted into focus. For almost a decade, the ACL SIGWAC ([http://www.sigwac.org.uk/]), and especially the highly successful Web as Corpus (WAC) workshops have served as a platform for researchers interested in compilation, processing and application of web-derived corpora. Past workshops were co-located with major conferences on computational linguistics and/or corpus linguistics (such as EACL, NAACL, LREC, WWW, and Corpus Linguistics).
    1618
    17 WAC-X will also feature the final workshop of the EmpiriST 2015 shared task "Automatic Linguistic Annotation of Computer-Mediated Communication / Social Media" (see https://sites.google.com/site/empirist2015/ for details) and the panel discussion "Corpora, open science, and copyright reforms" (see https://www.sigwac.org.uk/wiki/WAC-X#paneldisc for details).
     19WAC-X will also feature the final workshop of the EmpiriST 2015 shared task "Automatic Linguistic Annotation of Computer-Mediated Communication / Social Media" (see [https://sites.google.com/site/empirist2015/] for details) and the panel discussion "Corpora, open science, and copyright reforms" (see https://www.sigwac.org.uk/wiki/WAC-X#paneldisc for details).
    1820
    1921=== Organizers ===
     
    2325* [http://hpsg.fu-berlin.de/~rsling Roland Schäfer (Freie Universität Berlin)]
    2426* [http://iiegn.eu/work Egon Stemle (European Academy of Bozen/Bolzano)]
     27
     28Contact email: wacx2016@gmail.com[[BR]]
    2529
    2630=== Important dates ===
     
    6670=== Submission format ===
    6771
    68 All submissions must be in PDF format and should follow the ACL 2016 style guidelines. We strongly recommend the use of the ACL 2016 LaTeX style files or Microsoft Word Style files. The style files and example documents will be available from the workshop website or directly from http://acl2016.org. We reserve the right to reject submissions that do not conform to these styles including font and page size restrictions.
     72All submissions must be in PDF format and should follow the ACL 2016 style guidelines. We strongly recommend the use of the ACL 2016 LaTeX style files or Microsoft Word Style files. We reserve the right to reject submissions that do not conform to these styles including font and page size restrictions.
     73
     74* [http://acl2016.org/files/acl2016.zip "Download ACL 2016 style files here"] (or directly from [http://acl2016.org/index.php?article_id=9])
    6975
    7076Full paper submissions may consist of up to eight (8) pages of content plus any number of pages consisting of only references. Short papers may consist of up to four (4) pages of content plus any number of pages consisting of only references. Full papers will be distinguished from short papers in the proceedings.