---
license: mit
---

# Translation Tables for Probabilistic Structured Queries

This repository contains the raw translation tables for the package [`fast_psq`](https://github.com/hltcoe/PSQ). Please refer to the GitHub repository for more information. The following is a brief example of how to use the tables.

## Get started

`fast_psq` is available on PyPI.

```bash
pip install fast_psq ir_datasets ir_measures
```

The following is an example indexing command.

```bash
python -m fast_psq.index \
  --doc_file irds:neuclir/1/zh/trec-2022 \
  --lang zh \
  --psq_file hltcoe/psq_translation_tables:zh.table.dict.gz \
  --min_translation_prob 0.00010 \
  --max_translation_alternatives 64 \
  --max_translation_cdf 0.99 \
  --docid doc_id \
  --title title \
  --body text \
  --output_dir ./indexes/neuclir-zh.f32/ \
  --compression \
  --nworkers 64
```

The following is an example search command.

```bash
python -m fast_psq.search \
  --query_source irds:neuclir/1/zh/trec-2022 \
  --query_field title \
  --index_dir ./indexes/neuclir-zh.f32/ \
  --qrels irds:neuclir/1/zh/trec-2022 \
  --query_lang en \
  --output_file ./neuclir-zh.en.title.f32.trec
```

## Citation

```bibtex
@article{psq-repro,
  title = {Efficiency-Effectiveness Tradeoff of Probabilistic Structured Queries for Cross-Language Information Retrieval},
  author = {Eugene Yang and Suraj Nair and Dawn Lawrie and James Mayfield and Douglas W. Oard and Kevin Duh},
  journal = {arXiv preprint arXiv},
  year = {2024}
}
```
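
## Sketch: how the pruning options might interact

The indexing command above passes three thresholds: `--min_translation_prob`, `--max_translation_alternatives`, and `--max_translation_cdf`. The following is a minimal sketch of one plausible way such thresholds could prune the translation alternatives of a single source term. It is not taken from `fast_psq`. The function name `prune_translations`, the order in which the thresholds are applied, the choice to keep the alternative that crosses the CDF cutoff, and the final renormalization are all assumptions made for illustration; consult the GitHub repository for the actual behavior.

```python
from typing import Dict


def prune_translations(
    alternatives: Dict[str, float],
    min_prob: float = 1e-4,
    max_alternatives: int = 64,
    max_cdf: float = 0.99,
) -> Dict[str, float]:
    """Hypothetical pruning of one source term's translation alternatives.

    `alternatives` maps each target-language term to its translation
    probability. This is an illustrative assumption, not fast_psq's code.
    """
    # Visit alternatives from most to least probable.
    ranked = sorted(alternatives.items(), key=lambda kv: kv[1], reverse=True)

    kept: Dict[str, float] = {}
    cumulative = 0.0
    for target, prob in ranked:
        if prob < min_prob:
            break  # every remaining alternative is even less probable
        if len(kept) >= max_alternatives:
            break  # already holding the allowed number of alternatives
        if cumulative >= max_cdf:
            break  # the kept alternatives already cover enough mass
        kept[target] = prob
        cumulative += prob

    # Assumption: renormalize so the kept probabilities sum to 1.
    total = sum(kept.values())
    if total == 0:
        return {}
    return {target: prob / total for target, prob in kept.items()}


if __name__ == "__main__":
    example = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(prune_translations(example, max_cdf=0.7))
```

Under these assumptions, tighter thresholds keep fewer alternatives per term, which would shrink the index at the cost of discarding low-probability translations.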