% WAC8: wac8-proceedings.tex
% File wac8-proceedings.tex, 7.6 KB (added by egon w. stemle, 12 years ago)

\newcommand{\thetitle}{Proceedings of the 8th Web as Corpus Workshop (WAC-8)
@Corpus Linguistics 2013}
\newcommand{\authora}{Stefan Evert}
\newcommand{\authorb}{Egon Stemle}
\newcommand{\authorc}{Paul Rayson}
\newcommand{\theauthors}{\authora, \authorb, \authorc}
% init geometry with these values to have them when fancyhdr loads
\PassOptionsToPackage{%
  twoside=false,
  top=1cm,
  bottom=1cm,
  left=2.5cm,
  right=2.5cm,
  includeheadfoot}
{geometry}
\PassOptionsToPackage{%
  pdftitle={\thetitle},
  pdfauthor={\theauthors},
  pdfsubject={},
  pdfkeywords={},
  colorlinks=true,
  linkcolor=blue,
  bookmarkstype=pdf
}
{hyperref}

% use the easychair style
\documentclass[a4paper, onesided]{easychair}

% This provides the \BibTeX macro
\usepackage{doc}
\usepackage{makeidx}

% allow for inclusion of pdf documents
\usepackage{pdfpages}

%\makeindex

% from toc.tex
\usepackage{titletoc}
\titlecontents{subsubsection}[2pt]{\addvspace{10pt}\bfseries\titlerule[0.5pt]\filright}{}{}{}[]
\titlecontents{section}[0pt]{\addvspace{5pt}\filright}{}{}{\dotfill\contentspage}[]
\titlecontents{subsection}[10pt]{\addvspace{1pt}\itshape\filright}{}{}{}[]
\newcommand{\tocSection}[1]{\contentsline{subsubsection}{#1\\*\titlerule[0.5pt]\vspace{-9pt plus 2pt minus 2pt}}{}{}\nopagebreak[4]}
\newcommand{\tocTitle}[2]{\contentsline{section}{#1}{#2}{}\nopagebreak[4]}
\newcommand{\tocAuthors}[1]{\contentsline{subsection}{#1}{}{}}

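% Usage sketch for the three TOC helpers above. This file never calls them,
% so the argument values below are hypothetical placeholders, and the call
% pattern is an assumption read off the macro definitions alone:
% \tocSection takes a heading, \tocTitle a title and a page number, and
% \tocAuthors an author list.
%
%   \tocSection{Session 1}
%   \tocTitle{An Example Paper Title}{1}
%   \tocAuthors{A.Author, B.Author}
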
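% \insertpdf{<pdf file>}{<authors>}{<full title>}{<short title>}
%   #1  PDF file of the paper to include
%   #2  author list: TOC subsection entry and left running head
%   #3  full title: TOC section entry
%   #4  short title: right running head and entry in the `pdf' contents list
%       (the list selected for bookmarks by bookmarkstype=pdf above)
% The first page of each paper uses the plain page style; all later pages use
% the fancy style with running heads.
\DeclareRobustCommand{\insertpdf}[4]{
  \phantomsection
  \addcontentsline{pdf}{section}{#4}
  \addcontentsline{toc}{section}{#3}
  \addcontentsline{toc}{subsection}{#2}
  \fancyhead[LO,LE]{#2}
  \fancyhead[RO,RE]{#4}
  \includepdf[pagecommand={\thispagestyle{plain}}, pages=1]{#1}
  \includepdf[pagecommand={\thispagestyle{fancy}}, pages=2-]{#1}
}
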
%% Document
%%
\begin{document}

%% Front Matter
%%
\pagenumbering{roman}
\title{\thetitle}

% Authors are joined by \and. Their affiliations are given by \inst, which indexes
% into the list defined using \institute
%
\author{\authora\inst{1} \and \authorb\inst{2} \and \authorc\inst{3}}

% Institutes for affiliations are also joined by \and,
\institute{
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU),
Erlangen, Germany\\
%\email{mokhov@cse.concordia.ca}
\and
European Academy of Bozen/Bolzano (EURAC),
Bolzano (BZ), Italy\\
%\email{geoff@cs.miami.edu}\\
\and
Lancaster University,
Lancaster, U.K.\\
%\email{andrei@voronkov.com, graham@cs.man.ac.uk}\\
}

\fancyfoot[LO,LE]{S.Evert, E.Stemle, P.Rayson (eds.)}
\fancyfoot[CO,CE]{WAC-8, 2013}
\fancyfoot[RO,RE]{\thepage}

\fancypagestyle{plain}{%
  \fancyhf{} % clear all header and footer fields
  \fancyfoot[R]{{\normalsize\thepage}}
  \renewcommand{\headrulewidth}{0pt}
  \renewcommand{\footrulewidth}{0pt}}

% fine lines above footer and below header
\renewcommand{\headrulewidth}{0.4pt}\renewcommand{\footrulewidth}{0.4pt}

\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\thispagestyle{empty}
Web corpora and other Web-derived data have become a gold mine for corpus
linguistics and natural language processing. The Web is an easy source of
unprecedented amounts of linguistic data from a broad range of registers and
text types. However, a collection of Web pages is not immediately suitable for
exploration in the same way a traditional corpus is.

Since the first Web as Corpus Workshop, organised at the Corpus Linguistics
2005 Conference, a highly successful series of yearly Web as Corpus workshops
has provided a venue for interested researchers to meet, share ideas and
discuss the problems and possibilities of compiling and using Web corpora.
After a stronger focus on application-oriented natural language processing and
Web technology in recent years -- with workshops taking place at NAACL-HLT
2010, 2011 and WWW 2012 -- the 8th Web as Corpus Workshop returns to its roots
in the corpus linguistics community.

Accordingly, the leading theme of this workshop is the application of Web data
in language research, including linguistic evaluation of Web-derived corpora as
well as strategies and tools for high-quality automatic annotation of Web text.
The workshop brings together presentations on all aspects of building, using
and evaluating Web corpora, with a particular focus on the following topics:

\begin{itemize}
\item applications of Web corpora and other Web-derived data sets for
language research
\item automatic linguistic annotation of Web data such as tokenisation,
part-of-speech tagging, lemmatisation and semantic tagging
(the accuracy of currently available off-the-shelf tools is still
unsatisfactory for many types of Web data)
\item critical exploration of the characteristics of Web data from a
linguistic perspective and its applicability to language research
\item presentation of Web corpus collection projects or software tools
required for some part of this process (crawling, filtering,
de-duplication, language identification, indexing, \ldots)
\end{itemize}


\clearpage
\renewcommand\contentsname{Table of Contents}
\addcontentsline{pdf}{section}{Table of Contents}
\tableofcontents
\thispagestyle{plain}
\clearpage

%% main matter
%%
\thispagestyle{fancy}
\pagenumbering{arabic}
% paper_9.pdf paper_10.pdf paper_11.pdf paper_2.pdf paper_3.pdf paper_13.pdf paper_5.pdf paper_7.pdf paper_8.pdf paper_6.pdf paper_1.pdf paper_14.pdf

\insertpdf{paper_9.pdf}{A.Minocha, S.Reddy, A.Kilgarriff}{Feed Corpus : An Ever
Growing Up-to-date Corpus}{Feed Corpus}

\insertpdf{paper_10.pdf}{S.Wattam, P.Rayson, D.Berridge}{LWAC: Longitudinal
Web-as-Corpus Sampling}{LWAC}

\insertpdf{paper_11.pdf}{R.Sch\"afer, A.Barbaresi, F.Bildhauer}{The Good, the
Bad, and the Hazy: Design Decisions in Web Corpus Construction}{The Good, the
Bad, and the Hazy}

\insertpdf{paper_2.pdf}{J.Egbert, D.Biber}{Developing a User-based Method of
Web Register Classification}{Developing a User-based Method of Web Register
Classification}

\insertpdf{paper_7-mod.pdf}{A.Piperski, V.Belikov, N.Kopylov, E.Morozov,
V.Selegey, S.Sharoff}{Big and diverse is beautiful: A large corpus of Russian
to study linguistic variation}{Big and diverse is beautiful}

\insertpdf{paper_13.pdf}{D.Lutz, P.Cadwallader, M.Rooth}{A web application for
filtering and annotating web speech data}{Web application for filtering and
annotating web speech data}

\insertpdf{paper_5.pdf}{S.Schulz, V.Lyding, L.Nicolas}{STirWaC - Compiling a
diverse corpus based on texts from the web for South Tyrolean German}{STirWaC}

\insertpdf{paper_3.pdf}{A.Kilgarriff, V.Suchomel}{Web Spam}{Web Spam}

\insertpdf{paper_8.pdf}{A.Ferraresi, S.Bernardini}{The academic
Web-as-Corpus}{Academic Web-as-Corpus}

\insertpdf{paper_6.pdf}{S.Scheible, S.Schulte Im Walde, M.Weller, M.Kisselew}{A
Compact but Linguistically Detailed Database for German Verb Subcategorisation
relying on Dependency Parses from Web Corpora: Tool, Guidelines and
Resource}{Database for German Verb Subcategorisation}

\insertpdf{paper_1.pdf}{A.Brindle}{Thug breaks man's jaw: A Corpus Analysis of
Responses to Interpersonal Street Violence}{Thug breaks man's jaw}

\insertpdf{paper_14-mod.pdf}{C.Crangle}{A web-based model of semantic
relatedness and the analysis of electroencephalographic (EEG) data}{Web-based
model of semantic relatedness and the analysis of EEG data}

%\insertpdf{}{}{}{}

%------------------------------------------------------------------------------
\end{document}

% EOF
