Commit d76a8d20 authored by wieting2

added gitIgnore, now compiles

parent 67a77bbd
.gitignore 0 → 100644
*.aux
*.out
*.log
*.blg
*.pdf
*.bbl
\ No newline at end of file
emnlp.bib 0 → 100644
@@ -21,8 +21,46 @@
\usepackage{times}
\usepackage{url}
\usepackage{latexsym}
\usepackage{amsmath,amssymb}
\usepackage{multirow}
\usepackage{color}
\usepackage{graphicx}
\usepackage{bbm}
\usepackage{xspace}
\usepackage{wasysym}
\usepackage{algorithmic}
\usepackage{float}
\usepackage{mathtools}
\usepackage{array}
\usepackage{comment}
\usepackage{caption}
\usepackage[hidelinks]{hyperref}
\captionsetup{font=footnotesize}
\newcommand{\jwcomment}[1]{\textcolor{cyan}{\bf \small [ #1 --JW]}}
\newcommand{\dev}{\textsc{dev}\xspace}
\newcommand{\test}{\textsc{test}\xspace}
\newcommand{\avg}{\textsc{avg}\xspace}
\newcommand{\mostsim}{\textsc{MostSim}\xspace}
\newcommand{\leastup}{\textsc{LeastUpdate}\xspace}
\newcommand{\skipgram}{skip-gram\xspace}
\newcommand{\glove}{GloVe\xspace}
\newcommand{\annoppdb}{Annotated-PPDB\xspace}
\newcommand{\boldparagram}{\textbf{Paragram}\xspace}
\newcommand{\paragram}{\textsc{paragram}\xspace}
\newcommand{\annoppdbthreek}{Annotated-PPDB-3K\xspace}
\newcommand{\mlpara}{ML-Paraphrase\xspace}
\newcommand{\wsall}{WS353\xspace}
\newcommand{\wssim}{WS-S\xspace}
\newcommand{\wsrel}{WS-R\xspace}
\newcommand{\simlex}{SL999\xspace}
\newcommand{\latentalign}{LatentAlign\xspace}
\newcommand{\newllm}{NewLLM\xspace}
\newcommand{\lclr}{LCLR\xspace}
%\setlength\titlebox{5cm}
% You can expand the titlebox if you need extra space
@@ -56,23 +94,36 @@ We present a straightforward algorithm that gives state-of-the-art results on t
Text similarity and textual entailment are important tasks for NLP \jwcomment{Talk about why these are important with citations. Not sure what to put here - maybe talk about the need for downstream task evaluations as well as the importance of word and phrase embeddings}. In this paper, we make the following contributions:
\vspace{2pt}
\noindent\textbf{Introduce a strong baseline} for both tasks; in fact, it achieves state-of-the-art performance on the entailment task.

\noindent\textbf{Provide a state-of-the-art model}, \latentalign, that is easy to implement, can be trained online with SGD, and doubles as a tool for evaluating word and phrase embeddings.
\section{Related Work}
There has been a great deal of interest in word embeddings ...
More recently, a variety of methods have been developed to create phrase embeddings (cite: socher, me naacl short).
In this work we show that with a straightforward and easy-to-implement model we can obtain state-of-the-art results on an entailment task, and we also use this model to evaluate the suitability of these word and phrase embeddings for entailment and paraphrase tasks.
Latent alignment approaches have been used before for paraphrase detection and textual entailment. Our work differs in two major ways. First, our model is not limited to binary classification. Second, our model is an online approach that makes a single linear pass through the training data. This is in contrast to \lclr, which passes linearly through the positive examples in the dataset but iterates over the negative examples repeatedly until no new alignments are added to a cache; in practice, this creates unpredictability, as the algorithm can remain stuck in this loop for a long time without making much progress toward the global solution. Lastly, our model is easy to implement and can be optimized with plain SGD, making it suitable for evaluating word and phrase embeddings.
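To make this contrast concrete, here is a minimal sketch of such an online procedure; all symbols are illustrative rather than the paper's final formulation: $\mathbf{w}$ is a weight vector over alignment features, $\mathbf{f}(s, t, a)$ a feature map for sentence pair $(s, t)$ under alignment $a$, and the loss is assumed to be the squared error introduced later.
\begin{algorithmic}
\REQUIRE pairs $(s_i, t_i, y_i)$ for $i = 1, \dots, n$; rate $\eta$; regularizer $\lambda$; epochs $T$
\STATE $\mathbf{w} \gets \mathbf{0}$
\FOR{$\mathit{epoch} = 1$ \TO $T$}
\FOR{$i = 1$ \TO $n$}
\STATE $a^{*} \gets \arg\max_{a} \mathbf{w}^{\top} \mathbf{f}(s_i, t_i, a)$ \COMMENT{infer the best latent alignment}
\STATE $\hat{y}_i \gets \mathbf{w}^{\top} \mathbf{f}(s_i, t_i, a^{*})$ \COMMENT{score under that alignment}
\STATE $\mathbf{w} \gets \mathbf{w} - \eta \big( 2 (\hat{y}_i - y_i)\, \mathbf{f}(s_i, t_i, a^{*}) + 2 \lambda \mathbf{w} \big)$ \COMMENT{SGD step; assumes squared loss}
\ENDFOR
\ENDFOR
\end{algorithmic}
Each training example is visited exactly once per pass, so the runtime of an epoch is predictable, in contrast to the caching loop of \lclr described above.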
We evaluate on both the textual entailment and paraphrase tasks from SemEval. We experiment with the two predominant embeddings in use today, \skipgram and \glove (cite), as well as embeddings explicitly geared toward paraphrasing (cite me). We also experiment with
\section{Latent Alignment Model}
Given two sentences, we seek to align each token either to (1) a token in the other sentence or (2) a NULL token. In the second case, the token can be viewed as deleted. Thus, given a loss function $l$, our model becomes:
\begin{equation}
x + y
\end{equation}
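The equation above is still a placeholder in this draft. One natural instantiation, sketched here under the assumption of a linear scorer with the alignment treated as a latent variable to be maximized over, would be
\begin{equation*}
\min_{\mathbf{w}} \sum_{i=1}^{n} l\Big( \max_{a \in \mathcal{A}(s_i, t_i)} \mathbf{w}^{\top} \mathbf{f}(s_i, t_i, a),\; y_i \Big),
\end{equation*}
where $\mathcal{A}(s_i, t_i)$ denotes the set of alignments in which every token links either to a token of the other sentence or to NULL.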
In the SemEval task, textual entailment is a multi-class problem with three classes (Entailment, Contradiction, and Neutral), while textual similarity is a regression problem. We model both problems as regression, mapping the entailment labels to real numbers (Contradiction=0, Neutral=1, Entailment=2), which allows us to use the same model for both tasks. For the loss function, we chose a regularized squared loss:
\begin{equation}
z + t
\end{equation}
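This equation is likewise a placeholder. Written out as a sketch, the regularized squared loss named in the text (with $\hat{y}_i$ the model's score for pair $i$, $y_i$ its gold label or similarity score, and $\lambda$ a regularization constant) is
\begin{equation*}
\sum_{i=1}^{n} (\hat{y}_i - y_i)^2 + \lambda \|\mathbf{w}\|_2^2.
\end{equation*}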
Our model, \latentalign, can of course handle arbitrary features, but one of the goals of this work was to create a strong baseline model, so we use only what we call ``word-word'' features. These are binary features that, for a given word pair $w_1$ and $w_2$, have a value of 1 if $w_1$ is aligned to $w_2$ and 0 otherwise.
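Concretely, each such feature is indexed by a vocabulary pair $(u, v)$ and can be written as an indicator over the alignment $a$ (a sketch consistent with the description above):
\begin{equation*}
f_{(u,v)}(a) = \mathbbm{1}\big[ (u, v) \in a \big],
\end{equation*}
that is, the feature fires exactly when the alignment links an occurrence of $u$ to an occurrence of $v$.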
@@ -141,7 +192,7 @@ Results
\section{Conclusion}
\bibliographystyle{acl2015}
\bibliography{emnlp}
\end{document}
#!/bin/sh
# Build the paper: pdflatex, then bibtex to generate the bibliography,
# then pdflatex twice more so citations and cross-references resolve.
pdflatex emnlp2015.tex
bibtex emnlp2015
pdflatex emnlp2015.tex
pdflatex emnlp2015.tex
\ No newline at end of file
# Remove LaTeX build artifacts; -f suppresses errors if a file is absent.
rm -f *.log *.aux *.bbl *.blg *.out
\ No newline at end of file