Task-Oriented Query Reformulation with Reinforcement Learning
WHY?
Information retrieval from a search engine becomes difficult when the query is incomplete or overly complex. This paper proposes a query reformulation system that rewrites the original query to maximize the number of relevant documents returned.
WHAT?
Since the output of the query reformulator is discrete, the model is trained with the REINFORCE algorithm.
The word vectors of the query and of each candidate term are converted to fixed-size representations by a CNN with max pooling or an RNN. To generate a reformulated query term by term, an LSTM is used.
\phi_a(v):\ \text{fixed-size representation of the original query } q_0\\
\phi_b(e_i):\ \text{representation of candidate term } e_i \text{ and its context}\\
e_i \in q_0 \cup D_0,\quad D_0:\ \text{documents retrieved with } q_0\\
P(t_i|q_0) = \sigma(U^{\top}\tanh(W(\phi_a(v)\|\phi_b(e_i))+b))\\
P(t_i^k|q_0) \propto \exp(\phi_b(e_i)^{\top}h_k)\\
h_k = \tanh(W_a\phi_a(v)+W_b\phi_b(t^{k-1})+W_h h_{k-1})
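To make the two scoring heads concrete, here is a minimal PyTorch sketch; PyTorch itself, the dimension sizes, and all variable names are illustrative assumptions, not taken from the paper. The first module implements the term-wise classifier P(t_i|q_0); the second implements one step of the sequential scorer, following the vanilla-RNN update for h_k as written above (the paper's generator is described as an LSTM). The encoders producing \phi_a(v) and \phi_b(e_i) are assumed given.

```python
import torch
import torch.nn as nn

class TermScorer(nn.Module):
    """Term-wise head: P(t_i|q_0) = sigmoid(U^T tanh(W [phi_a(v); phi_b(e_i)] + b)).
    Dimensions are illustrative placeholders."""
    def __init__(self, dim=300, hidden=256):
        super().__init__()
        self.W = nn.Linear(2 * dim, hidden)        # W and bias b
        self.U = nn.Linear(hidden, 1, bias=False)  # row vector U^T

    def forward(self, phi_a_v, phi_b_e):
        # phi_a_v: (dim,) query vector; phi_b_e: (n_terms, dim) candidate terms
        q = phi_a_v.expand_as(phi_b_e)             # broadcast query to each term
        z = torch.tanh(self.W(torch.cat([q, phi_b_e], dim=-1)))
        return torch.sigmoid(self.U(z)).squeeze(-1)  # (n_terms,) probabilities


class SequentialScorer(nn.Module):
    """Sequential head: h_k = tanh(W_a phi_a(v) + W_b phi_b(t_{k-1}) + W_h h_{k-1}),
    P(t_i^k|q_0) proportional to exp(phi_b(e_i)^T h_k).
    Requires hidden == dim so the dot product with phi_b(e_i) is defined."""
    def __init__(self, dim=300, hidden=300):
        super().__init__()
        self.W_a = nn.Linear(dim, hidden, bias=False)
        self.W_b = nn.Linear(dim, hidden, bias=False)
        self.W_h = nn.Linear(hidden, hidden, bias=False)

    def step(self, phi_a_v, phi_b_prev, h_prev, phi_b_e):
        # One decoding step: update the hidden state, then score all candidates.
        h_k = torch.tanh(self.W_a(phi_a_v) + self.W_b(phi_b_prev) + self.W_h(h_prev))
        logits = phi_b_e @ h_k                     # (n_terms,) unnormalized scores
        return torch.softmax(logits, dim=-1), h_k
```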
As noted above, the model is trained with the REINFORCE algorithm. An entropy regularization loss is added to encourage diversity in the selected terms.
C_a = -(R - \bar{R})\sum_{t\in T} \log P(t|q_0)\\
\bar{R} = \sigma(S^{\top}\tanh(V(\phi_a(v)\|\bar{e})+b))\\
\bar{e} = \frac{1}{N}\sum_{i=1}^N \phi_b(e_i)\\
N = |q_0 \cup D_0|\\
C_b = \alpha\|R - \bar{R}\|^2\\
C_H = -\lambda \sum_{t\in q_0\cup D_0} P(t|q_0) \log P(t|q_0)
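Below is a minimal sketch of the combined cost C_a + C_b + C_H exactly as written above, again assuming PyTorch. The reward R is the retrieval score (e.g., recall) of the documents returned for the reformulated query, \bar{R} is assumed to come from the value network above, and the \alpha, \lambda values are placeholders.

```python
import torch

def reformulation_loss(log_probs, term_probs, reward, baseline,
                       alpha=0.1, lam=1e-3):
    """Combined cost C_a + C_b + C_H from the notes above.

    log_probs:  log P(t|q_0) for the terms actually selected (the set T)
    term_probs: P(t|q_0) for every candidate term in q_0 ∪ D_0
    reward:     scalar retrieval reward R for the reformulated query
    baseline:   scalar R_bar predicted by the value network
    alpha, lam: illustrative hyperparameter values
    """
    # C_a: policy-gradient term; detach the advantage so it acts as a
    # constant weight and gradients flow only through log P(t|q_0)
    c_a = -(reward - baseline).detach() * log_probs.sum()
    # C_b: regression loss pulling the baseline toward the observed reward
    c_b = alpha * (reward - baseline) ** 2
    # C_H: entropy regularization over all candidates (eps avoids log(0))
    eps = 1e-8
    c_h = -lam * (term_probs * (term_probs + eps).log()).sum()
    return c_a + c_b + c_h
```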
SO?
On the TREC-CAR, Jeopardy, and MSA datasets, the query reformulation model outperformed raw retrieval, pseudo relevance feedback (PRF) models, and supervised learning (SL) baselines, measured by recall, precision, and mean average precision.