SIGIR 2019: Tips for Retrieval Experiments

Since authors are adopting and reviewers are accepting research with inappropriate retrieval baselines, the winter conference deadline season is a great time to offer some advice for running good retrieval experiments. #sigir2019 #icml2019 #acl2019 #kdd2019 #ijcai2019 1/11

2018-11-05 23:25:08

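ACM SIGIR Tokyo Chapter @acmsigirtokyo

Some authors do research with inappropriate retrieval baselines, and some reviewers let it pass. The winter conference submission season is a good opportunity to offer advice on running proper retrieval experiments. #sigir2019 #icml2019 #acl2019 #kdd2019 #ijcai2019 1/11 twitter.com/diazf_acm/stat…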

2018-11-14 22:04:27

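ACM SIGIR Tokyo Chapter @acmsigirtokyo

There are many retrieval tasks, such as ad hoc retrieval, topic tracking, distributed retrieval, and query reformulation. Understanding how your research problem relates to these tasks gives you access to a large body of prior work and data that you can use as a foundation for your own research. 2/11 trec.nist.gov/data.html twitter.com/diazf_acm/stat…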

2018-11-14 22:06:04

Fernando Diaz @841io

There are many retrieval tasks, including ad hoc retrieval, topic tracking, distributed IR, query reformulation. Understanding how your problem relates to these will give you access to a lot of prior work and data to use as a foundation. 2/11 trec.nist.gov/data.html

2018-11-05 23:25:25

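ACM SIGIR Tokyo Chapter @acmsigirtokyo

If you want to run ad hoc retrieval experiments (ranking text documents in response to a user's keywords or question), TREC's extensive data and evaluation metrics will serve you well. 3/11 twitter.com/diazf_acm/stat…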

2018-11-14 22:06:34

Fernando Diaz @841io

If you are running ad hoc retrieval experiments (i.e. rank text documents based on user keywords or a question), TREC provides a great set of data and metrics for this. 3/11

2018-11-05 23:25:41

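ACM SIGIR Tokyo Chapter @acmsigirtokyo

To standardize experiments with the TREC test collections and cut down on setup work, I keep a package in a repository that converts all queries and judgments into a unified format and outputs them split into commonly used query/document subsets. 4/11 github.com/diazf/trec-data twitter.com/diazf_acm/stat…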

2018-11-14 22:07:32

Fernando Diaz @841io

In order to reduce some of the overhead and standardize experiments with these collections, I have a repository which will convert all queries and judgments into a single format broken down by common query/document subsets. 4/11 github.com/diazf/trec-data

2018-11-05 23:25:52
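
As a rough illustration of what "queries and judgments in a single format" can mean in practice, here is a minimal sketch that reads relevance judgments in the standard TREC qrels layout (`topic iteration docno relevance`, whitespace separated) into a nested dictionary. It is not the format or code used by the trec-data repository mentioned in the tweet; the function name `parse_qrels` and the sample lines are made up for illustration.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse TREC qrels lines of the form 'topic iteration docno relevance'.

    Returns {topic: {docno: relevance}}. Hypothetical helper, not part of
    any existing toolkit.
    """
    qrels = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, docno, rel = parts
        qrels[topic][docno] = int(rel)
    return dict(qrels)


# Illustrative judgments only; these are not real TREC data.
sample = [
    "301 0 DOC-0001 1",
    "301 0 DOC-0002 0",
    "302 0 DOC-0107 2",
]
qrels = parse_qrels(sample)
```

Keeping judgments keyed by topic and document makes it straightforward to restrict an experiment to a particular query or document subset before evaluation.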

ACM SIGIR Tokyo Chapter @acmsigirtokyo

I have also published a stopword list on GitHub, for anyone who wants to use a shared list. 5/11 twitter.com/diazf_acm/stat…

2018-11-14 22:08:09

Fernando Diaz @841io

It also includes a stopword list if people want to share one. 5/11

2018-11-05 23:26:18

ACM SIGIR Tokyo Chapter @acmsigirtokyo

Beyond Lucene, there are many open source implementations of algorithms for modern document ranking. They can use TREC data as-is. 6/11 cs.cmu.edu/~callan/Papers… twitter.com/diazf_acm/stat…

2018-11-14 22:08:30

Fernando Diaz @841io

For algorithms, there are many open source implementations beyond Lucene that provide modern document ranking and can consume TREC data out of box. 6/11 cs.cmu.edu/~callan/Papers…

2018-11-05 23:26:30

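ACM SIGIR Tokyo Chapter @acmsigirtokyo

I still use Indri, and I have updated the code so that commonly used strong baselines (e.g. proximity retrieval, pseudo-relevance feedback, query expansion with external resources) are easy to try. 7/11 github.com/diazf/indri twitter.com/diazf_acm/stat…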

2018-11-14 22:08:56

Fernando Diaz @841io

I still use Indri and have updated the code to make it easy to test common strong baselines (e.g. proximity, pseudo-relevance feedback, external expansion). 7/11 github.com/diazf/indri

2018-11-05 23:26:40
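
For readers unfamiliar with pseudo-relevance feedback, the idea is to treat the top-ranked documents from an initial retrieval as if they were relevant and to expand the query with terms drawn from them. The sketch below is a deliberately simplified, frequency-based version of that idea. It is not Indri's implementation or a relevance model; `expand_query`, the toy documents, and the stopword set are all hypothetical.

```python
from collections import Counter


def expand_query(query_terms, top_docs, stopwords, n_terms=2):
    """Append the n_terms most frequent new terms from the top-ranked documents.

    Simplified pseudo-relevance feedback: terms are scored by raw frequency
    across top_docs, skipping stopwords and terms already in the query.
    """
    counts = Counter()
    for doc in top_docs:
        for term in doc.lower().split():
            if term not in stopwords and term not in query_terms:
                counts[term] += 1
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return list(query_terms) + expansion


# Toy data standing in for the top results of an initial retrieval run.
stopwords = {"the", "of", "in", "a", "and"}
top_docs = [
    "the jaguar is a big cat of the americas",
    "jaguar habitat in the amazon and the big cat population",
]
expanded = expand_query(["jaguar"], top_docs, stopwords, n_terms=2)
```

A real baseline would weight expansion terms by retrieval score and interpolate them with the original query rather than simply appending them, which is one reason to rely on a tested implementation instead of an ad hoc one.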

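ACM SIGIR Tokyo Chapter @acmsigirtokyo

Choose an appropriate baseline carefully. The criteria include whether the proposed algorithm uses proximity information, whether it uses corpus analysis, and whether it uses data other than the retrieval target. 8/11 github.com/diazf/indri twitter.com/diazf_acm/stat…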

2018-11-14 22:09:15

Fernando Diaz @841io

You should pay particular attention to picking the right baseline based on whether your new algorithm has access to proximity information, corpus analysis, or non-target data. 8/11 github.com/diazf/indri#ba…

2018-11-05 23:26:53

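ACM SIGIR Tokyo Chapter @acmsigirtokyo

All of these baselines rank using text information alone. If you have accompanying non-text information or metadata, or if you need a learned ranking model, use the learning-to-rank implementations provided by Terrier, RankLib, Solr, and others. 9/11 twitter.com/diazf_acm/stat…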

2018-11-14 22:09:33

Fernando Diaz @841io

All of these baselines rank based purely on text content. If you have side information/document metadata or want a learned ranking model, you should use the Terrier, RankLib, or Solr implementations of learning-to-rank. 9/11

2018-11-05 23:27:03

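ACM SIGIR Tokyo Chapter @acmsigirtokyo

Finally, do not forget to justify your retrieval evaluation metrics. Metrics such as NDCG, Precision@k, and MAP rest on strong underlying assumptions. The appropriate metric depends on how you expect the proposed algorithm to be used. 10/11 @ian_soboroff twitter.com/diazf_acm/stat…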

2018-11-14 22:09:51

Fernando Diaz @841io

Finally, remember to justify your retrieval metrics. There are very strong assumptions behind metrics like NDCG, P@k, and MAP. The correct metric depends on how you expect your algorithm to be used. 10/11 @ian_soboroff

2018-11-05 23:27:13
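
To make some of those assumptions concrete, here is a minimal sketch of the three metrics for a single query. It assumes the common textbook definitions: P@k ignores order within the top k, average precision (averaged over queries to give MAP) treats relevance as binary and normalizes by the number of relevant documents, and NDCG applies a logarithmic rank discount, here with linear gain rather than the exponential variant. The function names and toy data are illustrative; for reported results a standard tool such as trec_eval is the safer choice.

```python
import math


def precision_at_k(ranking, relevant, k):
    """Fraction of the top k documents that are relevant (order within k ignored)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Mean of precision values at each rank where a relevant document appears.

    Binary relevance; unretrieved relevant documents contribute zero.
    """
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranking, gains, k):
    """NDCG@k with linear gain and a log2(rank + 1) discount."""
    def dcg(values):
        return sum(g / math.log2(rank + 1) for rank, g in enumerate(values, start=1))

    actual = dcg([gains.get(doc, 0) for doc in ranking[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Toy example: graded judgments for one query and one system ranking.
gains = {"d2": 2, "d4": 1, "d9": 1}
relevant = {doc for doc, gain in gains.items() if gain > 0}
ranking = ["d1", "d2", "d3", "d4"]

p_at_2 = precision_at_k(ranking, relevant, k=2)
ap = average_precision(ranking, relevant)
ndcg = ndcg_at_k(ranking, gains, k=4)
```

On the same ranking the three numbers differ because each encodes a different model of how a user consumes the list, which is why the choice of metric needs its own justification.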

ACM SIGIR Tokyo Chapter @acmsigirtokyo

Thank you for reading this far. For #sigir2019 authors in particular, I recommend reading @peter_r_bailey's excellent "SIGIR Paper Writing Tips". 11/11 microsoft.com/en-us/research… twitter.com/diazf_acm/stat…

2018-11-14 22:10:55

Fernando Diaz @841io

Thanks for your time and, for #sigir2019 authors specifically, I encourage you to read @peter_r_bailey's excellent "SIGIR Paper Writing Tips". 11/11 microsoft.com/en-us/research…

2018-11-05 23:27:28

ACM SIGIR Tokyo Chapter @acmsigirtokyo

Incidentally, the original text of @peter_r_bailey's excellent "SIGIR Paper Writing Tips" and a draft Japanese translation are collected at togetter.com/li/993441.

2018-11-14 22:11:47
