Sheshera Mysore committed on
Commit 517ce65
1 Parent(s): 1d36a62

Add own and pc profiles

Files changed (48)
  1. data/users/afan/embeds-afan-doc.npy +3 -0
  2. data/users/afan/embeds-afan-sent.pickle +3 -0
  3. data/users/afan/pid2idx-afan-doc.json +1 -0
  4. data/users/afan/seedset-afan-maple.json +735 -0
  5. data/users/agloberson/embeds-agloberson-doc.npy +2 -2
  6. data/users/agloberson/embeds-agloberson-sent.pickle +2 -2
  7. data/users/agloberson/pid2idx-agloberson-doc.json +1 -1
  8. data/users/agloberson/seedset-agloberson-maple.json +0 -0
  9. data/users/dbelgrave/embeds-dbelgrave-doc.npy +3 -0
  10. data/users/dbelgrave/embeds-dbelgrave-sent.pickle +3 -0
  11. data/users/dbelgrave/pid2idx-dbelgrave-doc.json +1 -0
  12. data/users/dbelgrave/seedset-dbelgrave-maple.json +527 -0
  13. data/users/ddowney/embeds-ddowney-doc.npy +2 -2
  14. data/users/ddowney/embeds-ddowney-sent.pickle +2 -2
  15. data/users/ddowney/pid2idx-ddowney-doc.json +1 -1
  16. data/users/ddowney/seedset-ddowney-maple.json +0 -0
  17. data/users/hzamani/embeds-hzamani-doc.npy +2 -2
  18. data/users/hzamani/embeds-hzamani-sent.pickle +2 -2
  19. data/users/hzamani/pid2idx-hzamani-doc.json +1 -1
  20. data/users/hzamani/seedset-hzamani-maple.json +0 -0
  21. data/users/jbragg/embeds-jbragg-doc.npy +2 -2
  22. data/users/jbragg/embeds-jbragg-sent.pickle +2 -2
  23. data/users/jbragg/pid2idx-jbragg-doc.json +1 -1
  24. data/users/jbragg/seedset-jbragg-maple.json +263 -25
  25. data/users/jtomczak/embeds-jtomczak-doc.npy +3 -0
  26. data/users/jtomczak/embeds-jtomczak-sent.pickle +3 -0
  27. data/users/jtomczak/pid2idx-jtomczak-doc.json +1 -0
  28. data/users/jtomczak/seedset-jtomczak-maple.json +0 -0
  29. data/users/lsoldaini/embeds-lsoldaini-doc.npy +2 -2
  30. data/users/lsoldaini/embeds-lsoldaini-sent.pickle +2 -2
  31. data/users/lsoldaini/pid2idx-lsoldaini-doc.json +1 -1
  32. data/users/lsoldaini/seedset-lsoldaini-maple.json +366 -34
  33. data/users/nmahyar/embeds-nmahyar-doc.npy +2 -2
  34. data/users/nmahyar/embeds-nmahyar-sent.pickle +2 -2
  35. data/users/nmahyar/pid2idx-nmahyar-doc.json +1 -1
  36. data/users/nmahyar/seedset-nmahyar-maple.json +417 -42
  37. data/users/nshah/embeds-nshah-doc.npy +2 -2
  38. data/users/nshah/embeds-nshah-sent.pickle +2 -2
  39. data/users/nshah/pid2idx-nshah-doc.json +1 -1
  40. data/users/nshah/seedset-nshah-maple.json +0 -0
  41. data/users/smysore/embeds-smysore-doc.npy +3 -0
  42. data/users/smysore/embeds-smysore-sent.pickle +3 -0
  43. data/users/smysore/pid2idx-smysore-doc.json +1 -0
  44. data/users/smysore/seedset-smysore-maple.json +230 -0
  45. data/users/upaquet/embeds-upaquet-doc.npy +3 -0
  46. data/users/upaquet/embeds-upaquet-sent.pickle +3 -0
  47. data/users/upaquet/pid2idx-upaquet-doc.json +1 -0
  48. data/users/upaquet/seedset-upaquet-maple.json +523 -0
data/users/afan/embeds-afan-doc.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:77e6a9f80e2c1a8b73ec58b70a6261c4729858e85e431c7e743a9c70dee5f23f
+ size 368768
data/users/afan/embeds-afan-sent.pickle ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c43f9b3c1f25afffd5a92b60358ff56ed501f87fcf2645254201d5038f40827f
+ size 1252858
data/users/afan/pid2idx-afan-doc.json ADDED
@@ -0,0 +1 @@
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59}
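The pid2idx mapping above suggests how the per-user files fit together. A minimal sketch, assuming (this is an inference, not stated in the commit) that pid2idx-afan-doc.json maps a paper id to the row index of that paper's embedding in embeds-afan-doc.npy; stand-in data is used in place of the LFS-tracked files:

```python
import json

import numpy as np

# Hypothetical usage, inferred from the file names in this commit:
# pid2idx-<user>-doc.json appears to map a paper id (string key) to the
# row index of that paper's embedding in embeds-<user>-doc.npy.
pid2idx = json.loads('{"0": 0, "1": 1, "2": 2}')        # stand-in for pid2idx-afan-doc.json
embeds = np.arange(12, dtype=np.float32).reshape(3, 4)  # stand-in for embeds-afan-doc.npy (3 papers x 4 dims)

doc_vector = embeds[pid2idx["1"]]  # embedding row for paper id "1"
print(doc_vector.tolist())         # [4.0, 5.0, 6.0, 7.0]
```

With the real files, `np.load("data/users/afan/embeds-afan-doc.npy")` and `json.load(open("data/users/afan/pid2idx-afan-doc.json"))` would replace the stand-in data once the LFS objects are fetched.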
data/users/afan/seedset-afan-maple.json ADDED
@@ -0,0 +1,735 @@
+ {
+ "username": "afan",
+ "s2_authorid": "144270981",
+ "papers": [
+ {
+ "title": "Llama 2: Open Foundation and Fine-Tuned Chat Models",
+ "abstract": [
+ "In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.",
+ "Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.",
+ "Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.",
+ "We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs."
+ ]
+ },
+ {
+ "title": "Ngambay-French Neural Machine Translation (sba-Fr)",
+ "abstract": [
+ "In Africa, and the world at large, there is an increasing focus on developing Neural Machine Translation (NMT) systems to overcome language barriers.",
+ "NMT for Low-resource language is particularly compelling as it involves learning with limited labelled data.",
+ "However, obtaining a well-aligned parallel corpus for low-resource languages can be challenging.",
+ "The disparity between the technological advancement of a few global languages and the lack of research on NMT for local languages in Chad is striking.",
+ "End-to-end NMT trials on low-resource Chad languages have not been attempted.",
+ "Additionally, there is a dearth of online and well-structured data gathering for research in Natural Language Processing, unlike some African languages.",
+ "However, a guided approach for data gathering can produce bitext data for many Chadian language translation pairs with well-known languages that have ample data.",
+ "In this project, we created the first sba-Fr Dataset, which is a corpus of Ngambay-to-French translations, and fine-tuned three pre-trained models using this dataset.",
+ "Our experiments show that the M2M100 model outperforms other models with high BLEU scores on both original and original+synthetic data.",
+ "The publicly available bitext dataset can be used for research purposes."
+ ]
+ },
+ {
+ "title": "Revisiting Machine Translation for Cross-lingual Classification",
+ "abstract": [
+ "Machine Translation (MT) has been widely used for cross-lingual classification, either by translating the test set into English and running inference with a monolingual model (translate-test), or translating the training set into the target languages and finetuning a multilingual model (translate-train).",
+ "However, most research in the area focuses on the multilingual models rather than the MT component.",
+ "We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.",
+ "The optimal approach, however, is highly task dependent, as we identify various sources of cross-lingual transfer gap that affect different tasks and approaches differently.",
+ "Our work calls into question the dominance of multilingual models for cross-lingual classification, and prompts to pay more attention to MT-based baselines."
+ ]
+ },
+ {
+ "title": "Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation",
+ "abstract": [
+ "For many languages, machine translation progress is hindered by the lack of reliable training data.",
+ "Models are trained on whatever pre-existing datasets may be available and then augmented with synthetic data, because it is often not economical to pay for the creation of large-scale datasets.",
+ "But for the case of low-resource languages, would the creation of a few thousand professionally translated sentence pairs give any benefit?",
+ "In this paper, we show that it does.",
+ "We describe a broad data collection effort involving around 6k professionally translated sentence pairs for each of 39 low-resource languages, which we make publicly available.",
+ "We analyse the gains of models trained on this small but high-quality data, showing that it has significant impact even when larger but lower quality pre-existing corpora are used, or when data is augmented with millions of sentences through backtranslation."
+ ]
+ },
+ {
+ "title": "A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation",
+ "abstract": [
+ "Recent advances in the pre-training for language models leverage large-scale datasets to create multilingual models.",
+ "However, low-resource languages are mostly left out in these datasets.",
+ "This is primarily because many widely spoken languages that are not well represented on the web and therefore excluded from the large-scale crawls for datasets.",
+ "Furthermore, downstream users of these models are restricted to the selection of languages originally chosen for pre-training.",
+ "This work investigates how to optimally leverage existing pre-trained models to create low-resource translation systems for 16 African languages.",
+ "We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pretraining?",
+ "and 2) How can the resulting translation models effectively transfer to new domains?",
+ "To answer these questions, we create a novel African news corpus covering 16 languages, of which eight languages are not part of any existing evaluation dataset.",
+ "We demonstrate that the most effective strategy for transferring both additional languages and additional domains is to leverage small quantities of high-quality translation data to fine-tune large pre-trained models."
+ ]
+ },
+ {
+ "title": "Reasoning over Public and Private Data in Retrieval-Based Systems",
+ "abstract": [
+ "Abstract Users an organizations are generating ever-increasing amounts of private data from a wide range of sources.",
+ "Incorporating private context is important to personalize open-domain tasks such as question-answering, fact-checking, and personal assistants.",
+ "State-of-the-art systems for these tasks explicitly retrieve information that is relevant to an input question from a background corpus before producing an answer.",
+ "While today\u2019s retrieval systems assume relevant corpora are fully (e.g., publicly) accessible, users are often unable or unwilling to expose their private data to entities hosting public data.",
+ "We define the Split Iterative Retrieval (SPIRAL) problem involving iterative retrieval over multiple privacy scopes.",
+ "We introduce a foundational benchmark with which to study SPIRAL, as no existing benchmark includes data from a private distribution.",
+ "Our dataset, ConcurrentQA, includes data from distinct public and private distributions and is the first textual QA benchmark requiring concurrent retrieval over multiple distributions.",
+ "Finally, we show that existing retrieval approaches face significant performance degradations when applied to our proposed retrieval setting and investigate approaches with which these tradeoffs can be mitigated.",
+ "We release the new benchmark and code to reproduce the results.1"
+ ]
+ },
+ {
+ "title": "stopes - Modular Machine Translation Pipelines",
+ "abstract": [
+ "Neural machine translation, as other natural language deep learning applications, is hungry for data.",
+ "As research evolves, the data pipelines supporting that research evolve too, oftentimes re-implementing the same core components.",
+ "Despite the potential of modular codebases, researchers have but little time to put code structure and reusability first.",
+ "Unfortunately, this makes it very hard to publish clean, reproducible code to benefit a wider audience.",
+ "In this paper, we motivate and describe stopes , a framework that addresses these issues while empowering scalability and versatility for research use cases.",
+ "This library was a key enabler of the No Language Left Behind project, establishing new state of the art performance for a multilingual machine translation model covering 200 languages.",
+ "stopes and the pipelines described are released under the MIT license at https://github.com/ facebookresearch/stopes."
+ ]
+ },
+ {
+ "title": "Generating Biographies on Wikipedia: The Impact of Gender Bias on the Retrieval-Based Generation of Women Biographies",
+ "abstract": [
+ "Generating factual, long-form text such as Wikipedia articles raises three key challenges: how to gather relevant evidence, how to structure information into well-formed text, and how to ensure that the generated text is factually correct.",
+ "We address these by developing a model for English text that uses a retrieval mechanism to identify relevant supporting information on the web and a cache-based pre-trained encoder-decoder to generate long-form biographies section by section, including citation information.",
+ "To assess the impact of available web evidence on the output text, we compare the performance of our approach when generating biographies about women (for which less information is available on the web) vs. biographies generally.",
+ "To this end, we curate a dataset of 1,500 biographies about women.",
+ "We analyze our generated text to understand how differences in available web evidence data affect generation.",
+ "We evaluate the factuality, fluency, and quality of the generated texts using automatic metrics and human evaluation.",
+ "We hope that these techniques can be used as a starting point for human writers, to aid in reducing the complexity inherent in the creation of long-form, factual text."
+ ]
+ },
+ {
+ "title": "RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question",
+ "abstract": [
+ "Existing metrics for evaluating the quality of automatically generated questions such as BLEU, ROUGE, BERTScore, and BLEURT compare the reference and predicted questions, providing a high score when there is a considerable lexical overlap or semantic similarity between the candidate and the reference questions.",
+ "This approach has two major shortcomings.",
+ "First, we need expensive human-provided reference questions.",
+ "Second, it penalises valid questions that may not have high lexical or semantic similarity to the reference questions.",
+ "In this paper, we propose a new metric, RQUGE, based on the answerability of the candidate question given the context.",
+ "The metric consists of a question-answering and a span scorer modules, using pre-trained models from existing literature, thus it can be used without any further training.",
+ "We demonstrate that RQUGE has a higher correlation with human judgment without relying on the reference question.",
+ "Additionally, RQUGE is shown to be more robust to several adversarial corruptions.",
+ "Furthermore, we illustrate that we can significantly improve the performance of QA models on out-of-domain datasets by fine-tuning on synthetic data generated by a question generation model and re-ranked by RQUGE."
+ ]
+ },
+ {
+ "title": "MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases",
+ "abstract": [
+ "Progress in sentence simplification has been hindered by a lack of labeled parallel simplification data, particularly in languages other than English.",
+ "We introduce MUSS, a Multilingual Unsupervised Sentence Simplification system that does not require labeled simplification data.",
+ "MUSS uses a novel approach to sentence simplification that trains strong models using sentence-level paraphrase data instead of proper simplification data.",
+ "These models leverage unsupervised pretraining and controllable generation mechanisms to flexibly adjust attributes such as length and lexical complexity at inference time.",
+ "We further present a method to mine such paraphrase data in any language from Common Crawl using semantic sentence embeddings, thus removing the need for labeled data.",
+ "We evaluate our approach on English, French, and Spanish simplification benchmarks and closely match or outperform the previous best supervised results, despite not using any labeled simplification data.",
+ "We push the state of the art further by incorporating labeled simplification data."
+ ]
+ },
+ {
+ "title": "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model",
+ "abstract": [
+ "Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.",
+ "While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public.",
+ "As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers.",
+ "BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).",
+ "We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning.",
+ "To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License."
+ ]
+ },
+ {
+ "title": "AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas",
+ "abstract": [
+ "Little attention has been paid to the development of human language technology for truly low-resource languages\u2014i.e., languages with limited amounts of digitally available text data, such as Indigenous languages.",
+ "However, it has been shown that pretrained multilingual models are able to perform crosslingual transfer in a zero-shot setting even for low-resource languages which are unseen during pretraining.",
+ "Yet, prior work evaluating performance on unseen languages has largely been limited to shallow token-level tasks.",
+ "It remains unclear if zero-shot learning of deeper semantic tasks is possible for unseen languages.",
+ "To explore this question, we present AmericasNLI, a natural language inference dataset covering 10 Indigenous languages of the Americas.",
+ "We conduct experiments with pretrained models, exploring zero-shot learning in combination with model adaptation.",
+ "Furthermore, as AmericasNLI is a multiway parallel dataset, we use it to benchmark the performance of different machine translation models for those languages.",
+ "Finally, using a standard transformer model, we explore translation-based approaches for natural language inference.",
+ "We find that the zero-shot performance of pretrained models without adaptation is poor for all languages in AmericasNLI, but model adaptation via continued pretraining results in improvements.",
+ "All machine translation models are rather weak, but, surprisingly, translation-based approaches to natural language inference outperform all other models on that task."
+ ]
+ },
+ {
+ "title": "No Language Left Behind: Scaling Human-Centered Machine Translation",
+ "abstract": [
+ "Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today.",
+ "However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages.",
+ "What does it take to break the 200 language barrier while ensuring safe, high quality results, all while keeping ethical considerations in mind?",
+ "In No Language Left Behind, we took on this challenge by first contextualizing the need for low-resource language translation support through exploratory interviews with native speakers.",
+ "Then, we created datasets and models aimed at narrowing the performance gap between low and high-resource languages.",
+ "More specifically, we developed a conditional compute model based on Sparsely Gated Mixture of Experts that is trained on data obtained with novel and effective data mining techniques tailored for low-resource languages.",
+ "We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks.",
+ "Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety.",
+ "Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art, laying important groundwork towards realizing a universal translation system.",
+ "Finally, we open source all contributions described in this work, accessible at https://github.com/facebookresearch/fairseq/tree/nllb."
+ ]
+ },
+ {
+ "title": "Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog",
+ "abstract": [
+ "Semantic parsing using sequence-to-sequence models allows parsing of deeper representations compared to traditional word tagging based models.",
+ "In spite of these advantages, widespread adoption of these models for real-time conversational use cases has been stymied by higher compute requirements and thus higher latency.",
+ "In this work, we propose a non-autoregressive approach to predict semantic parse trees with an efficient seq2seq model architecture.",
+ "By combining non-autoregressive prediction with convolutional neural networks, we achieve significant latency gains and parameter size reduction compared to traditional RNN models.",
+ "Our novel architecture achieves up to an 81% reduction in latency on TOP dataset and retains competitive performance to non-pretrained models on three different semantic parsing datasets."
+ ]
+ },
+ {
+ "title": "Facebook AI\u2019s WMT21 News Translation Task Submission",
+ "abstract": [
+ "We describe Facebook\u2019s multilingual model submission to the WMT2021 shared task on news translation.",
+ "We participate in 14 language directions: English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese.",
+ "To develop systems covering all these directions, we focus on multilingual models.",
+ "We utilize data from all available sources \u2014 WMT, large-scale data mining, and in-domain backtranslation \u2014 to create high quality bilingual and multilingual baselines.",
+ "Subsequently, we investigate strategies for scaling multilingual model size, such that one system has sufficient capacity for high quality representations of all eight languages.",
+ "Our final submission is an ensemble of dense and sparse Mixture-of-Expert multilingual translation models, followed by finetuning on in-domain news data and noisy channel reranking.",
+ "Compared to previous year\u2019s winning submissions, our multilingual system improved the translation quality on all language directions, with an average improvement of 2.0 BLEU.",
+ "In the WMT2021 task, our system ranks first in 10 directions based on automatic evaluation."
+ ]
+ },
+ {
+ "title": "Tricks for Training Sparse Translation Models",
+ "abstract": [
+ "Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.",
+ "Sparse scaling architectures, such as BASELayers, provide flexible mechanisms for different tasks to have a variable number of parameters, which can be useful to counterbalance skewed data distributions.",
+ "We find that that sparse architectures for multilingual machine translation can perform poorly out of the box and propose two straightforward techniques to mitigate this \u2014 a temperature heating mechanism and dense pre-training.",
+ "Overall, these methods improve performance on two multilingual translation benchmarks compared to standard BASELayers and Dense scaling baselines, and in combination, more than 2x model convergence speed."
+ ]
+ },
+ {
+ "title": "Not All Memories are Created Equal: Learning to Forget by Expiring",
+ "abstract": [
+ "Attention mechanisms have shown promising results in sequence modeling tasks that require long-term memory.",
+ "Recent work investigated mechanisms to reduce the computational cost of preserving and storing memories.",
+ "However, not all content in the past is equally important to remember.",
+ "We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information.",
+ "This forgetting of memories enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently, as not all states from previous timesteps are preserved.",
+ "We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve strong performance on reinforcement learning tasks specifically designed to challenge this functionality.",
+ "Next, we show that Expire-Span can scale to memories that are tens of thousands in size, setting a new state of the art on incredibly long context tasks such as character-level language modeling and a frame-by-frame moving objects task.",
+ "Finally, we analyze the efficiency of Expire-Span compared to existing approaches and demonstrate that it trains faster and uses less memory."
+ ]
+ },
+ {
+ "title": "The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation",
+ "abstract": [
+ "One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks.",
+ "Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures.",
+ "In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains.",
+ "These sentences have been translated in 101 languages by professional translators through a carefully controlled process.",
+ "The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are fully aligned.",
+ "By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond."
+ ]
+ },
+ {
+ "title": "EXTREME MODEL COMPRESSION",
+ "abstract": [
+ "We tackle the problem of producing compact models, maximizing their accuracy for a given model size.",
+ "A standard solution is to train networks with Quantization Aware Training (Jacob et al., 2018), where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator (Bengio et al., 2013).",
+ "In this paper, we extend this approach to work beyond int8 fixedpoint quantization with extreme compression methods where the approximations introduced by STE are severe, such as Product Quantization.",
+ "Our proposal is to only quantize a different random subset of weights during each forward, allowing for unbiased gradients to flow through the other weights.",
+ "Controlling the amount of noise and its form allows for extreme compression rates while maintaining the performance of the original model.",
+ "As a result we establish new state-of-the-art compromises between accuracy and model size both in natural language processing and image classification.",
+ "For example, applying our method to state-of-the-art Transformer and ConvNet architectures, we can achieve 82.5% accuracy on MNLI by compressing RoBERTa to 14 MB and 80.0% top-1 accuracy on ImageNet by compressing an EfficientNet-B3 to 3.3 MB.",
+ "1"
+ ]
+ },
+ {
+ "title": "AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages",
+ "abstract": [
+ "Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining.",
+ "However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages.",
+ "To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 Indigenous languages of the Americas.",
+ "We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches.",
+ "Additionally, we explore model adaptation via continued pretraining and provide an analysis of the dataset by considering hypothesis-only models.",
+ "We find that XLM-R\u2019s zero-shot performance is poor for all 10 languages, with an average performance of 38.48%.",
+ "Continued pretraining offers improvements, with an average accuracy of 43.85%.",
+ "Surprisingly, training on poorly translated data by far outperforms all other methods with an accuracy of 49.12%."
+ ]
250
+ },
251
+ {
+ "title": "Alternative Input Signals Ease Transfer in Multilingual Machine Translation",
+ "abstract": [
+ "Recent work in multilingual machine translation (MMT) has focused on the potential of positive transfer between languages, particularly cases where higher-resourced languages can benefit lower-resourced ones.",
+ "While training an MMT model, the supervision signals learned from one language pair can be transferred to the other via the tokens shared by multiple source languages.",
+ "However, the transfer is inhibited when the token overlap among source languages is small, which manifests naturally when languages use different writing systems.",
+ "In this paper, we tackle inhibited transfer by augmenting the training data with alternative signals that unify different writing systems, such as phonetic, romanized, and transliterated input.",
+ "We test these signals on Indic and Turkic languages, two language families where the writing systems differ but languages still share common features.",
+ "Our results indicate that a straightforward multi-source self-ensemble \u2013 training a model on a mixture of various signals and ensembling the outputs of the same model fed with different signals during inference \u2013 outperforms strong ensemble baselines by 1.3 BLEU points on both language families.",
+ "Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.",
+ "Finally, our analysis demonstrates that including alternative signals yields more consistency and translates named entities more accurately, which is crucial for increased factuality of automated systems."
+ ]
+ },
+ {
+ "title": "Multilingual Translation from Denoising Pre-Training",
+ "abstract": [
+ "Recent work demonstrates the potential of training one model for multilingual machine translation.",
+ "In parallel, denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains.",
+ "However, little has been explored on the potential to combine denoising pretraining with multilingual machine translation in a single model.",
+ "In this work, we fill this gap by studying how multilingual translation models can be created through multilingual finetuning.",
+ "Finetuning a multilingual model from a denoising pretrained model incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low-resource languages where bitext is rare.",
+ "Further, we create the ML50 benchmark to facilitate reproducible research by standardizing training and evaluation data.",
+ "On ML50, we show that multilingual finetuning significantly improves over multilingual models trained from scratch and bilingual finetuning for translation into English.",
+ "We also find that multilingual finetuning can significantly improve over multilingual models trained from scratch for zero-shot translation on non-English directions.",
+ "Finally, we discuss that the pretraining and finetuning paradigm alone is not enough to address the challenges of multilingual models for performance on to-Many directions."
+ ]
+ },
+ {
+ "title": "Findings of the WMT 2021 Shared Task on Large-Scale Multilingual Machine Translation",
+ "abstract": [
+ "We present the results of the first task on Large-Scale Multilingual Machine Translation.",
+ "The task consists of the many-to-many evaluation of a single model across a variety of source and target languages.",
+ "This year, the task consisted of three different settings: (i) SMALL-TASK1 (Central/South-Eastern European Languages), (ii) SMALL-TASK2 (South-East Asian Languages), and (iii) FULL-TASK (all 101 x 100 language pairs).",
+ "All the tasks used the FLORES-101 dataset as the evaluation benchmark.",
+ "To ensure the longevity of the dataset, the test sets were not publicly released and the models were evaluated in a controlled environment on Dynabench.",
+ "There were a total of 10 participating teams for the tasks, with a total of 151 intermediate model submissions and 13 final models.",
+ "This year\u2019s results show a significant improvement over the known baselines, with +17.8 BLEU for SMALL-TASK2, +10.6 for FULL-TASK, and +3.6 for SMALL-TASK1."
+ ]
+ },
+ {
+ "title": "Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas",
+ "abstract": [
+ "This paper presents the results of the 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas.",
+ "The shared task featured two independent tracks, and participants submitted machine translation systems for up to 10 indigenous languages.",
+ "Overall, 8 teams participated with a total of 214 submissions.",
+ "We provided training sets consisting of data collected from various sources, as well as manually translated sentences for the development and test sets.",
+ "An official baseline trained on this data was also provided.",
+ "Team submissions featured a variety of architectures, including both statistical and neural models, and for the majority of languages, many teams were able to considerably improve over the baseline.",
+ "The best performing systems achieved 12.97 ChrF higher than the baseline when averaged across languages."
+ ]
+ },
+ {
+ "title": "Findings of the 2021 Conference on Machine Translation (WMT21)",
+ "abstract": [
+ "This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021.",
+ "In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.",
+ "The task was also opened up to additional test suites to probe specific aspects of translation."
+ ]
+ },
+ {
+ "title": "Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations",
+ "abstract": [
+ "While research on explaining predictions of open-domain QA systems (ODQA) is gaining momentum, most works do not evaluate whether these explanations improve user trust.",
+ "Furthermore, many users interact with ODQA using voice-assistants, yet prior works exclusively focus on visual displays, risking (as we also show) incorrectly extrapolating the effectiveness of explanations across modalities.",
+ "To better understand the effectiveness of ODQA explanation strategies in the wild, we conduct user studies that measure whether explanations help users correctly decide when to accept or reject an ODQA system\u2019s answer.",
+ "Unlike prior work, we control for explanation modality, i.e., whether they are communicated to users through a spoken or visual interface, and contrast effectiveness across modalities.",
+ "We show that explanations derived from retrieved evidence can outperform strong baselines across modalities but the best explanation strategy varies with the modality.",
+ "We show common failure cases of current explanations, emphasize end-to-end evaluation of explanations, and caution against evaluating them in proxy modalities that differ from deployment."
+ ]
+ },
+ {
+ "title": "Training with Quantization Noise for Extreme Model Compression",
+ "abstract": [
+ "We tackle the problem of producing compact models, maximizing their accuracy for a given model size.",
+ "A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.",
+ "In this paper, we extend this approach to work beyond int8 fixed-point quantization with extreme compression methods where the approximations introduced by STE are severe, such as Product Quantization.",
+ "Our proposal is to only quantize a different random subset of weights during each forward, allowing for unbiased gradients to flow through the other weights.",
+ "Controlling the amount of noise and its form allows for extreme compression rates while maintaining the performance of the original model.",
+ "As a result we establish new state-of-the-art compromises between accuracy and model size both in natural language processing and image classification.",
+ "For example, applying our method to state-of-the-art Transformer and ConvNet architectures, we can achieve 82.5% accuracy on MNLI by compressing RoBERTa to 14 MB and 80.0% top-1 accuracy on ImageNet by compressing an EfficientNet-B3 to 3.3 MB."
+ ]
+ },
+ {
+ "title": "Facebook AI\u2019s WMT20 News Translation Task Submission",
+ "abstract": [
+ "This paper describes Facebook AI\u2019s submission to WMT20 shared news translation task.",
+ "We focus on the low resource setting and participate in two language pairs, Tamil <-> English and Inuktitut <-> English, where there are limited out-of-domain bitext and monolingual data.",
+ "We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.",
+ "We explore techniques that leverage bitext and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation, and reranking.",
+ "To better adapt the translation system to the test domain, we explore dataset tagging and fine-tuning on in-domain data.",
+ "We observe that different techniques provide varied improvements based on the available data of the language pair.",
+ "Based on these findings, we integrate these techniques into one training pipeline.",
+ "For En->Ta, we explore an unconstrained setup with additional Tamil bitext and monolingual data and show that further improvement can be obtained.",
+ "On the test set, our best submitted systems achieve 21.5 and 13.7 BLEU for Ta->En and En->Ta respectively, and 27.9 and 13.0 for Iu->En and En->Iu respectively."
+ ]
+ },
+ {
+ "title": "KILT: a Benchmark for Knowledge Intensive Language Tasks",
+ "abstract": [
+ "Challenging problems such as open-domain question answering, fact checking, slot filling and entity linking require access to large, external knowledge sources.",
+ "While some models do well on individual tasks, developing general models is difficult as each task might require computationally expensive indexing of custom knowledge sources, in addition to dedicated infrastructure.",
+ "To catalyze research on models that condition on specific information in large textual resources, we present a benchmark for knowledge-intensive language tasks (KILT).",
+ "All tasks in KILT are grounded in the same snapshot of Wikipedia, reducing engineering turnaround through the re-use of components, as well as accelerating research into task-agnostic memory architectures.",
+ "We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance.",
+ "We find that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text.",
+ "KILT data and code are available at https://github.com/facebookresearch/KILT."
+ ]
+ },
+ {
+ "title": "Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions",
+ "abstract": [
+ "We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.",
+ "We present a biased view, focusing on work done by our own group, while citing related work in each area.",
+ "In particular, we discuss in detail the properties of continual learning, providing engaging content, and being well-behaved -- and how to measure success in providing them.",
+ "We end with a discussion of our experience and learnings, and our recommendations to the community."
+ ]
+ },
+ {
+ "title": "Multilingual Translation with Extensible Multilingual Pretraining and Finetuning",
+ "abstract": [
+ "Recent work demonstrates the potential of multilingual pretraining to create one model that can be used for various tasks in different languages.",
+ "Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext.",
+ "In this work, we show that multilingual translation models can be created through multilingual finetuning.",
+ "Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time.",
+ "Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low resource languages where bitext is not available.",
+ "We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance.",
+ "We double the number of languages in mBART to support multilingual machine translation models of 50 languages.",
+ "Finally, we create the ML50 benchmark, covering low, mid, and high resource languages, to facilitate reproducible research by standardizing training and evaluation data.",
+ "On ML50, we demonstrate that multilingual finetuning improves on average 1 BLEU over the strongest baselines (being either multilingual from scratch or bilingual finetuning) while improving 9.3 BLEU on average over bilingual baselines from scratch."
+ ]
+ },
+ {
+ "title": "Nearest Neighbor Machine Translation",
+ "abstract": [
+ "We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.",
+ "This approach requires no additional training and scales to give the decoder direct access to billions of examples at test time, resulting in a highly expressive model that consistently improves performance across many settings.",
+ "Simply adding nearest neighbor search improves a state-of-the-art German-English translation model by 1.5 BLEU.",
+ "$k$NN-MT allows a single model to be adapted to diverse domains by using a domain-specific datastore, improving results by an average of 9.2 BLEU over zero-shot transfer, and achieving new state-of-the-art results---without training on these domains.",
+ "A massively multilingual model can also be specialized for particular language pairs, with improvements of 3 BLEU for translating from English into German and Chinese.",
+ "Qualitatively, $k$NN-MT is easily interpretable; it combines source and target context to retrieve highly relevant examples."
+ ]
+ },
+ {
+ "title": "Multilingual AMR-to-Text Generation",
+ "abstract": [
+ "Generating text from structured data is challenging because it requires bridging the gap between (i) structure and natural language (NL) and (ii) semantically underspecified input and fully specified NL output.",
+ "Multilingual generation brings in an additional challenge: that of generating into languages with varied word order and morphological properties.",
+ "In this work, we focus on Abstract Meaning Representations (AMRs) as structured input, where previous research has overwhelmingly focused on generating only into English.",
+ "We leverage advances in cross-lingual embeddings, pretraining, and multilingual models to create multilingual AMR-to-text models that generate in twenty-one different languages.",
+ "For eighteen languages, based on automatic metrics, our multilingual models surpass baselines that generate into a single language.",
+ "We analyse the ability of our multilingual models to accurately capture morphology and word order using human evaluation, and find that native speakers judge our generations to be fluent."
+ ]
+ },
+ {
+ "title": "Training with Quantization Noise for Extreme Fixed-Point Compression",
+ "abstract": [
+ "We tackle the problem of producing compact models, maximizing their accuracy for a given model size.",
+ "A standard solution is to train networks with Quantization Aware Training [1], where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator [2].",
+ "In this paper, we extend this approach to work with extreme compression methods where the approximations introduced by STE are severe.",
+ "Our proposal is to only quantize a different random subset of weights during each forward, allowing for unbiased gradients to flow through the other weights.",
+ "Controlling the amount of noise and its form allows for extreme compression rates while maintaining the performance of the original model.",
+ "As a result we establish new state-of-the-art compromises between accuracy and model size both in natural language processing and image classification.",
+ "For example, applying our method to state-of-the-art Transformer and ConvNet architectures, we can achieve 82.5% accuracy on MNLI by compressing RoBERTa to 14 MB and 80.0% top-1 accuracy on ImageNet by compressing an EfficientNet-B3 to 3.3 MB."
+ ]
+ },
+ {
+ "title": "Multilingual Unsupervised Sentence Simplification",
+ "abstract": [
+ "Progress in Sentence Simplification has been hindered by the lack of supervised data, particularly in languages other than English.",
+ "Previous work has aligned sentences from original and simplified corpora such as English Wikipedia and Simple English Wikipedia, but this limits corpus size, domain, and language.",
+ "In this work, we propose using unsupervised mining techniques to automatically create training corpora for simplification in multiple languages from raw Common Crawl web data.",
+ "When coupled with a controllable generation mechanism that can flexibly adjust attributes such as length and lexical complexity, these mined paraphrase corpora can be used to train simplification systems in any language.",
+ "We further incorporate multilingual unsupervised pretraining methods to create even stronger models and show that by training on mined data rather than supervised corpora, we outperform the previous best results.",
+ "We evaluate our approach on English, French, and Spanish simplification benchmarks and reach state-of-the-art performance with a totally unsupervised approach.",
+ "We will release our models and code to mine the data in any language included in Common Crawl."
+ ]
+ },
+ {
+ "title": "Beyond English-Centric Multilingual Machine Translation",
+ "abstract": [
+ "Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.",
+ "However, much of this work is English-Centric by training only on data which was translated from or to English.",
+ "While this is supported by large sources of training data, it does not reflect translation needs worldwide.",
+ "In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.",
+ "We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining.",
+ "Then, we explore how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models.",
+ "Our focus on non-English-Centric models brings gains of more than 10 BLEU when directly translating between non-English directions while performing competitively to the best single systems of WMT.",
+ "We open-source our scripts so that others may reproduce the data, evaluation, and final M2M-100 model."
+ ]
+ },
+ {
+ "title": "Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA",
+ "abstract": [
+ "While research on explaining predictions of open-domain QA systems (ODQA) to users is gaining momentum, most works have failed to evaluate the extent to which explanations improve user trust.",
+ "While few works evaluate explanations using user studies, they employ settings that may deviate from the end-user's usage in-the-wild: ODQA is most ubiquitous in voice-assistants, yet current research only evaluates explanations using a visual display, and may erroneously extrapolate conclusions about the most performant explanations to other modalities.",
+ "To alleviate these issues, we conduct user studies that measure whether explanations help users correctly decide when to accept or reject an ODQA system's answer.",
+ "Unlike prior work, we control for explanation modality, e.g., whether they are communicated to users through a spoken or visual interface, and contrast effectiveness across modalities.",
+ "Our results show that explanations derived from retrieved evidence passages can outperform strong baselines (calibrated confidence) across modalities but the best explanation strategy in fact changes with the modality.",
+ "We show common failure cases of current explanations, emphasize end-to-end evaluation of explanations, and caution against evaluating them in proxy modalities that are different from deployment."
+ ]
+ },
+ {
+ "title": "Improving Text-to-Text Pre-trained Models for the Graph-to-Text Task",
+ "abstract": [
+ "Converting a knowledge graph or sub-graph to natural text is useful when answering questions based on a knowledge base.",
+ "High-capacity language models pre-trained on large-scale text corpora have recently been shown to be powerful when fine-tuned for the knowledge-graph-to-text (KG-to-text) task.",
+ "In this paper, we propose two classes of methods to improve such pre-trained models for this task.",
+ "First, we improve the structure awareness of the model by organizing the input as well as learning optimal ordering via multitask learning.",
+ "Second, we bridge the domain gap between text-to-text and KG-to-text tasks via a second-phase KG-to-text pre-training on similar datasets and extra lexicalization supervision to make the input more similar to natural text.",
+ "We demonstrate the efficacy of our methods on the popular WebNLG dataset.",
+ "Our best model achieves an almost 3-point BLEU improvement on a strong baseline while lowering the relative slot-error-rate by around 35%.",
+ "We also validate our results via human evaluation."
+ ]
+ },
+ {
+ "title": "Generating Fact Checking Briefs",
+ "abstract": [
+ "Fact checking at scale is difficult -- while the number of active fact checking websites is growing, it remains too small for the needs of the contemporary media ecosystem.",
+ "However, despite good intentions, contributions from volunteers are often error-prone, and thus in practice restricted to claim detection.",
+ "We investigate how to increase the accuracy and efficiency of fact checking by providing information about the claim before performing the check, in the form of natural language briefs.",
+ "We investigate passage-based briefs, containing a relevant passage from Wikipedia, entity-centric ones consisting of Wikipedia pages of mentioned entities, and Question-Answering Briefs, with questions decomposing the claim, and their answers.",
+ "To produce QABriefs, we develop QABriefer, a model that generates a set of questions conditioned on the claim, searches the web for evidence, and generates answers.",
+ "To train its components, we introduce QABriefDataset which we collected via crowdsourcing.",
+ "We show that fact checking with briefs -- in particular QABriefs -- increases the accuracy of crowdworkers by 10% while slightly decreasing the time taken.",
+ "For volunteer (unpaid) fact checkers, QABriefs slightly increase accuracy and reduce the time required by around 20%."
+ ]
+ },
+ {
+ "title": "Accessing Higher-level Representations in Sequential Transformers with Feedback Memory",
+ "abstract": [
+ "Transformers are feedforward networks that can process input tokens in parallel.",
+ "While this parallelization makes them computationally efficient, it restricts the model from fully exploiting the sequential nature of the input - the representation at a given layer can only access representations from lower layers, rather than the higher level representations already built in previous time steps.",
+ "In this work, we propose the Feedback Transformer architecture that exposes all previous representations to all future representations, meaning the lowest representation of the current timestep is formed from the highest-level abstract representation of the past.",
+ "We demonstrate on a variety of benchmarks in language modeling, neural machine translation, summarization, and reinforcement learning that the increased representation capacity can improve over Transformer baselines."
+ ]
+ },
+ {
+ "title": "Augmenting Transformers with KNN-Based Composite Memory for Dialog",
+ "abstract": [
+ "Various machine learning tasks can benefit from access to external information of different modalities, such as text and images.",
+ "Recent work has focused on learning architectures with large memories capable of storing this knowledge.",
+ "We propose augmenting generative Transformer neural networks with KNN-based Information Fetching (KIF) modules.",
+ "Each KIF module learns a read operation to access fixed external knowledge.",
+ "We apply these modules to generative dialog modeling, a challenging task where information must be flexibly retrieved and incorporated to maintain the topic and flow of conversation.",
+ "We demonstrate the effectiveness of our approach by identifying relevant knowledge required for knowledgeable but engaging dialog from Wikipedia, images, and human-written dialog utterances, and show that leveraging this retrieved information improves model performance, measured by automatic and human evaluation."
+ ]
+ },
+ {
+ "title": "Multi-Dimensional Gender Bias Classification",
+ "abstract": [
+ "Machine learning models are trained to find patterns in data.",
+ "NLP models can inadvertently learn socially undesirable patterns when training on gender biased text.",
+ "In this work, we propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions: bias from the gender of the person being spoken about, bias from the gender of the person being spoken to, and bias from the gender of the speaker.",
+ "Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.",
+ "In addition, we collect a novel, crowdsourced evaluation benchmark of utterance-level gender rewrites.",
+ "Distinguishing between gender bias along multiple dimensions is important, as it enables us to train finer-grained gender bias classifiers.",
+ "We show our classifiers prove valuable for a variety of important applications, such as controlling for gender bias in generative models, detecting gender bias in arbitrary text, and shedding light on offensive language in terms of genderedness."
+ ]
+ },
+ {
+ "title": "Augmenting Transformers with KNN-Based Composite Memory for Dialog",
+ "abstract": [
+ "Various machine learning tasks can benefit from access to external information of different modalities, such as text and images.",
+ "Recent work has focused on learning architectures with large memories capable of storing this knowledge.",
+ "We propose augmenting generative Transformer neural networks with KNN-based Information Fetching (KIF) modules.",
+ "Each KIF module learns a read operation to access fixed external knowledge.",
+ "We apply these modules to generative dialog modeling, a challenging task where information must be flexibly retrieved and incorporated to maintain the topic and flow of conversation.",
+ "We demonstrate the effectiveness of our approach by identifying relevant knowledge required for knowledgeable but engaging dialog from Wikipedia, images, and human-written dialog utterances, and show that leveraging this retrieved information improves model performance, measured by automatic and human evaluation."
+ ]
+ },
+ {
+ "title": "Queens Are Powerful Too: Mitigating Gender Bias in Dialogue Generation",
+ "abstract": [
+ "Models often easily learn biases present in the training data, and their predictions directly reflect this bias.",
+ "We analyze gender bias in dialogue data, and examine how this bias is actually amplified in subsequent generative chit-chat dialogue models.",
+ "We measure gender bias in six existing dialogue datasets, and focus on the most biased one, the multi-player text-based fantasy adventure dataset LIGHT, as a testbed for our bias mitigation techniques.",
+ "The LIGHT dataset is highly imbalanced with respect to gender, containing predominantly male characters, likely because it is entirely collected by crowdworkers and reflects common biases that exist in fantasy or medieval settings.",
+ "We consider three techniques to mitigate gender bias: counterfactual data augmentation, targeted data collection, and bias controlled training.",
+ "We show that our proposed techniques mitigate gender bias in LIGHT by balancing the genderedness of generated dialogue utterances and are particularly effective in combination.",
+ "We quantify performance using various evaluation methods---such as quantity of gendered words, a dialogue safety classifier, and human studies---all of which show that our models generate less gendered, but equally engaging chit-chat responses."
+ ]
+ },
+ {
+ "title": "Strategies for Structuring Story Generation",
+ "abstract": [
+ "Writers often rely on plans or sketches to write long stories, but most current language models generate word by word from left to right.",
+ "We explore coarse-to-fine models for creating narrative texts of several hundred words, and introduce new models which decompose stories by abstracting over actions and entities.",
+ "The model first generates the predicate-argument structure of the text, where different mentions of the same entity are marked with placeholder tokens.",
+ "It then generates a surface realization of the predicate-argument structure, and finally replaces the entity placeholders with context-sensitive names and references.",
+ "Human judges prefer the stories from our models to a wide range of previous approaches to hierarchical text generation.",
+ "Extensive analysis shows that our methods can help improve the diversity and coherence of events and entities in generated stories."
+ ]
+ },
+ {
+ "title": "Generating Interactive Worlds with Text",
+ "abstract": [
+ "Procedurally generating cohesive and interesting game environments is challenging and time-consuming.",
+ "In order for the relationships between the game elements to be natural, common sense has to be encoded into the arrangement of the elements.",
+ "In this work, we investigate a machine learning approach for world creation using content from the multi-player text adventure game environment LIGHT (Urbanek et al. 2019).",
+ "We introduce neural network based models to compositionally arrange locations, characters, and objects into a coherent whole.",
+ "In addition to creating worlds based on existing elements, our models can generate new game content.",
+ "Humans can also leverage our models to interactively aid in worldbuilding.",
+ "We show that the game environments created with our approach are cohesive, diverse, and preferred by human evaluators compared to other machine learning based world construction algorithms."
+ ]
+ },
+ {
556
+ "title": "Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs",
557
+ "abstract": [
558
+ "Query-based open-domain NLP tasks require information synthesis from long and diverse web results.",
559
+ "Current approaches extractively select portions of web text as input to Sequence-to-Sequence models using methods such as TF-IDF ranking.",
560
+ "We propose constructing a local graph structured knowledge base for each query, which compresses the web search information and reduces redundancy.",
561
+ "We show that by linearizing the graph into a structured input sequence, models can encode the graph representations within a standard Sequence-to-Sequence setting.",
562
+ "For two generative tasks with very long text input, long-form question answering and multi-document summarization, feeding graph representations as input can achieve better performance than using retrieved text portions."
563
+ ]
564
+ },
565
+ {
566
+ "title": "Assessing topic model relevance: Evaluation and informative priors",
567
+ "abstract": [
568
+ "Latent Dirichlet allocation (LDA) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content.",
569
+ "Even when canonical stopwords are manually removed, uninformative words common in that corpus will still dominate the most probable words in a topic.",
570
+ "In this work, we first show how the standard topic quality measures of coherence and pointwise mutual information act counter\u2010intuitively in the presence of common but irrelevant words, making it difficult to even quantitatively identify situations in which topics may be dominated by stopwords.",
571
+ "We propose an additional topic quality metric that targets the stopword problem, and show that it, unlike the standard measures, correctly correlates with human judgments of quality as defined by concentration of information\u2010rich words.",
572
+ "We also propose a simple\u2010to\u2010implement strategy for generating topics that are evaluated to be of much higher quality by both human assessment and our new metric.",
573
+ "This approach, a collection of informative priors easily introduced into most LDA\u2010style inference methods, automatically promotes terms with domain relevance and demotes domain\u2010specific stop words.",
574
+ "We demonstrate this approach's effectiveness in three very different domains: Department of Labor accident reports, online health forum posts, and NIPS abstracts.",
575
+ "Overall we find that current practices thought to solve this problem do not do so adequately, and that our proposal offers a substantial improvement for those interested in interpreting their topics as objects in their own right."
576
+ ]
577
+ },
578
+ {
579
+ "title": "GLOSS: Generative Latent Optimization of Sentence Representations",
580
+ "abstract": [
581
+ "We propose a method to learn unsupervised sentence representations in a non-compositional manner based on Generative Latent Optimization.",
582
+ "Our approach does not impose any assumptions on how words are to be combined into a sentence representation.",
583
+ "We discuss a simple Bag of Words model as well as a variant that models word positions.",
584
+ "Both are trained to reconstruct the sentence based on a latent code and our model can be used to generate text.",
585
+ "Experiments show large improvements over the related Paragraph Vectors.",
586
+ "Compared to uSIF, we achieve a relative improvement of 5% when trained on the same data and our method performs competitively to Sent2vec while trained on 30 times less data."
587
+ ]
588
+ },
589
+ {
590
+ "title": "ELI5: Long Form Question Answering",
591
+ "abstract": [
592
+ "We introduce the first large-scale corpus for long form question answering, a task requiring elaborate and in-depth answers to open-ended questions.",
593
+ "The dataset comprises 270K threads from the Reddit forum \u201cExplain Like I\u2019m Five\u201d (ELI5) where an online community provides answers to questions which are comprehensible by five year olds.",
594
+ "Compared to existing datasets, ELI5 comprises diverse questions requiring multi-sentence answers.",
595
+ "We provide a large set of web documents to help answer the question.",
596
+ "Automatic and human evaluations show that an abstractive model trained with a multi-task objective outperforms conventional Seq2Seq, language modeling, as well as a strong extractive baseline.",
597
+ "However, our best model is still far from human performance since raters prefer gold responses in over 86% of cases, leaving ample opportunity for future improvement."
598
+ ]
599
+ },
600
+ {
601
+ "title": "Reducing Transformer Depth on Demand with Structured Dropout",
602
+ "abstract": [
603
+ "Overparameterized transformer networks have obtained state of the art results in various natural language processing tasks, such as machine translation, language modeling, and question answering.",
604
+ "These models contain hundreds of millions of parameters, necessitating a large amount of computation and making them prone to overfitting.",
605
+ "In this work, we explore LayerDrop, a form of structured dropout, which has a regularization effect during training and allows for efficient pruning at inference time.",
606
+ "In particular, we show that it is possible to select sub-networks of any depth from one large network without having to finetune them and with limited impact on performance.",
607
+ "We demonstrate the effectiveness of our approach by improving the state of the art on machine translation, language modeling, summarization, question answering, and language understanding benchmarks.",
608
+ "Moreover, we show that our approach leads to small BERT-like models of higher quality compared to training from scratch or using distillation."
609
+ ]
610
+ },
611
+ {
612
+ "title": "Learning to Speak and Act in a Fantasy Text Adventure Game",
613
+ "abstract": [
614
+ "We introduce a large-scale crowdsourced text adventure game as a research platform for studying grounded dialogue.",
615
+ "In it, agents can perceive, emote, and act whilst conducting dialogue with other agents.",
616
+ "Models and humans can both act as characters within the game.",
617
+ "We describe the results of training state-of-the-art generative and retrieval models in this setting.",
618
+ "We show that in addition to using past dialogue, these models are able to effectively use the state of the underlying world to condition their predictions.",
619
+ "In particular, we show that grounding on the details of the local environment, including location descriptions, and the objects (and their affordances) and characters (and their previous actions) present within it allows better predictions of agent behavior and dialogue.",
620
+ "We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully."
621
+ ]
622
+ },
623
+ {
624
+ "title": "fairseq: A Fast, Extensible Toolkit for Sequence Modeling",
625
+ "abstract": [
626
+ "fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.",
627
+ "The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines.",
628
+ "We also support fast mixed-precision training and inference on modern GPUs.",
629
+ "A demo video can be found at https://www.youtube.com/watch?v=OtgDdWtHvto"
630
+ ]
631
+ },
632
+ {
633
+ "title": "Pay Less Attention with Lightweight and Dynamic Convolutions",
634
+ "abstract": [
635
+ "Self-attention is a useful mechanism to build generative models for language and images.",
636
+ "It determines the importance of context elements by comparing each element to the current time step.",
637
+ "In this paper, we show that a very lightweight convolution can perform competitively to the best reported self-attention results.",
638
+ "Next, we introduce dynamic convolutions which are simpler and more efficient than self-attention.",
639
+ "We predict separate convolution kernels based solely on the current time-step in order to determine the importance of context elements.",
640
+ "The number of operations required by this approach scales linearly in the input length, whereas self-attention is quadratic.",
641
+ "Experiments on large-scale machine translation, language modeling and abstractive summarization show that dynamic convolutions improve over strong self-attention models.",
642
+ "On the WMT'14 English-German test set dynamic convolutions achieve a new state of the art of 29.7 BLEU."
643
+ ]
644
+ },
645
+ {
646
+ "title": "Wizard of Wikipedia: Knowledge-Powered Conversational agents",
647
+ "abstract": [
648
+ "In open-domain dialogue intelligent agents should exhibit the use of knowledge, however there are few convincing demonstrations of this to date.",
649
+ "The most popular sequence to sequence models typically \"generate and hope\" generic utterances that can be memorized in the weights of the model when mapping from input utterance(s) to output, rather than employing recalled knowledge as context.",
650
+ "Use of knowledge has so far proved difficult, in part because of the lack of a supervised learning benchmark task which exhibits knowledgeable open dialogue with clear grounding.",
651
+ "To that end we collect and release a large dataset with conversations directly grounded with knowledge retrieved from Wikipedia.",
652
+ "We then design architectures capable of retrieving knowledge, reading and conditioning on it, and finally generating natural responses.",
653
+ "Our best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while our new benchmark allows for measuring further improvements in this important research direction."
654
+ ]
655
+ },
656
+ {
657
+ "title": "Hierarchical Neural Story Generation",
658
+ "abstract": [
659
+ "We explore story generation: creative systems that can build coherent and fluent passages of text about a topic.",
660
+ "We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum.",
661
+ "Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text.",
662
+ "We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context.",
663
+ "Experiments show large improvements over strong baselines on both automated and human evaluations.",
664
+ "Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one."
665
+ ]
666
+ },
667
+ {
668
+ "title": "Prior matters: simple and general methods for evaluating and improving topic quality in topic modeling",
669
+ "abstract": [
670
+ "Latent Dirichlet Allocation (LDA) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content.",
671
+ "Even when canonical stopwords are manually removed, uninformative words common in that corpus will still dominate the most probable words in a topic.",
672
+ "In this work, we first show how the standard topic quality measures of coherence and pointwise mutual information act counter-intuitively in the presence of common but irrelevant words, making it difficult to even quantitatively identify situations in which topics may be dominated by stopwords.",
673
+ "We propose an additional topic quality metric that targets the stopword problem, and show that it, unlike the standard measures, correctly correlates with human judgements of quality.",
674
+ "We also propose a simple-to-implement strategy for generating topics that are evaluated to be of much higher quality by both human assessment and our new metric.",
675
+ "This approach, a collection of informative priors easily introduced into most LDA-style inference methods, automatically promotes terms with domain relevance and demotes domain-specific stop words.",
676
+ "We demonstrate this approach's effectiveness in three very different domains: Department of Labor accident reports, online health forum posts, and NIPS abstracts.",
677
+ "Overall we find that current practices thought to solve this problem do not do so adequately, and that our proposal offers a substantial improvement for those interested in interpreting their topics as objects in their own right."
678
+ ]
679
+ },
680
+ {
681
+ "title": "Promoting Domain-Specific Terms in Topic Models with Informative Priors",
682
+ "abstract": [
683
+ "Latent Dirichlet Allocation (LDA) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content.",
684
+ "Even when canonical stopwords are manually removed, uninformative words common in that corpus will still dominate the most probable words in a topic.",
685
+ "We propose a simple strategy for automatically promoting terms with domain relevance and demoting these domain-specific stop words.",
686
+ "Our approach is easily applied within any existing LDA framework and increases the amount of domain-relevant content and reduces the appearance of canonical and human-evaluated stopwords in three very different domains: Department of Labor accident reports, online health forum posts, and NIPS abstracts.",
687
+ "Along the way, we show that standard topic quality measures such as coherence and pointwise mutual information act counter-intuitively in the presence of common but irrelevant words.",
688
+ "We also explain why these standard metrics fall short, propose an additional topic quality metric that targets the stopword problem, and show that it correlates with our human subject experiments."
689
+ ]
690
+ },
691
+ {
692
+ "title": "Controllable Abstractive Summarization",
693
+ "abstract": [
694
+ "Current models for document summarization disregard user preferences such as the desired length, style, the entities that the user might be interested in, or how much of the document the user has already read.",
695
+ "We present a neural summarization model with a simple but effective mechanism to enable users to specify these high level attributes in order to control the shape of the final summaries to better suit their needs.",
696
+ "With user input, our system can produce high quality summaries that follow user preferences.",
697
+ "Without user input, we set the control variables automatically \u2013 on the full text CNN-Dailymail dataset, we outperform state of the art abstractive systems (both in terms of F1-ROUGE1, 40.38 vs. 39.53, and human evaluation)."
698
+ ]
699
+ },
700
+ {
701
+ "title": "Language Modeling with Gated Convolutional Networks",
702
+ "abstract": [
703
+ "The pre-dominant approach to language modeling to date is based on recurrent neural networks.",
704
+ "Their success on this task is often linked to their ability to capture unbounded context.",
705
+ "In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens.",
706
+ "We propose a novel simplified gating mechanism that outperforms Oord et al. (2016) and investigate the impact of key architectural decisions.",
707
+ "The proposed approach achieves state-of-the-art on the WikiText-103 benchmark, even though it features long-term dependencies, as well as competitive results on the Google Billion Words benchmark.",
708
+ "Our model reduces the latency to score a sentence by an order of magnitude compared to a recurrent baseline.",
709
+ "To our knowledge, this is the first time a non-recurrent approach is competitive with strong recurrent models on these large scale language tasks."
710
+ ]
711
+ }
712
+ ],
713
+ "user_kps": [
714
+ "attention-based convolutional neural network",
715
+ "automatic question generation",
716
+ "conversational agents",
717
+ "cross-lingual transfer",
718
+ "dialogue model",
719
+ "explanation system",
720
+ "latent dirichlet allocation model",
721
+ "machine translation evaluation",
722
+ "memory networks",
723
+ "model compression",
724
+ "multi-engine machine translation",
725
+ "multilingual neural machine translation",
726
+ "neural machine translation model",
727
+ "neural sequence-to-sequence models",
728
+ "private information retrieval",
729
+ "quantized neural networks",
730
+ "sentence encoders",
731
+ "story generation",
732
+ "topic modeling",
733
+ "unsupervised neural machine translation"
734
+ ]
735
+ }
data/users/agloberson/embeds-agloberson-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1a726fbe60d92a1b19be389d42d03bbb5189ffeb009b8e5ce1266ba9703787f4
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6508af8c35642985a8c9c2eafe2fa61f1e2affc8ffceac3999657d1729bb20ce
3
+ size 848000
data/users/agloberson/embeds-agloberson-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1d3e16fdc4cf0e9b266fbdf434a24e78c90a7bc9ee84fd134cf2c6aa99c05c14
3
- size 510946
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4c062d99d27e800281baa28d1e799287a5992c31d179a456059fe2843d16a502
3
+ size 3296230
data/users/agloberson/pid2idx-agloberson-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59, "60": 60, "61": 61, "62": 62, "63": 63, "64": 64, "65": 65, "66": 66, "67": 67, "68": 68, "69": 69, "70": 70, "71": 71, "72": 72, "73": 73, "74": 74, "75": 75, "76": 76, "77": 77, "78": 78, "79": 79, "80": 80, "81": 81, "82": 82, "83": 83, "84": 84, "85": 85, "86": 86, "87": 87, "88": 88, "89": 89, "90": 90, "91": 91, "92": 92, "93": 93, "94": 94, "95": 95, "96": 96, "97": 97, "98": 98, "99": 99, "100": 100, "101": 101, "102": 102, "103": 103, "104": 104, "105": 105, "106": 106, "107": 107, "108": 108, "109": 109, "110": 110, "111": 111, "112": 112, "113": 113, "114": 114, "115": 115, "116": 116, "117": 117, "118": 118, "119": 119, "120": 120, "121": 121, "122": 122, "123": 123, "124": 124, "125": 125, "126": 126, "127": 127, "128": 128, "129": 129, "130": 130, "131": 131, "132": 132, "133": 133, "134": 134, "135": 135, "136": 136, "137": 137}
data/users/agloberson/seedset-agloberson-maple.json CHANGED
The diff for this file is too large to render. See raw diff
 
data/users/dbelgrave/embeds-dbelgrave-doc.npy ADDED
@@ -0,0 +1,3 @@
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:797fe334ef435c5c0bfc025bbcc7e388d1974f9668e80ec3a427fd079e0a12ad
3
+ size 258176
data/users/dbelgrave/embeds-dbelgrave-sent.pickle ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7bb470fa2c1f1211548b9e3eb218b26a34eef242f9ce5bfa5217653e50167b0a
3
+ size 812860
data/users/dbelgrave/pid2idx-dbelgrave-doc.json ADDED
@@ -0,0 +1 @@
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41}
data/users/dbelgrave/seedset-dbelgrave-maple.json ADDED
@@ -0,0 +1,527 @@
1
+ {
2
+ "username": "dbelgrave",
3
+ "s2_authorid": "145763736",
4
+ "papers": [
5
+ {
6
+ "title": "Generative models improve fairness of medical classifiers under distribution shifts",
7
+ "abstract": [
8
+ "A ubiquitous challenge in machine learning is the problem of domain generalisation.",
9
+ "This can exacerbate bias against groups or labels that are underrepresented in the datasets used for model development.",
10
+ "Model bias can lead to unintended harms, especially in safety-critical applications like healthcare.",
11
+ "Furthermore, the challenge is compounded by the difficulty of obtaining labelled data due to high cost or lack of readily available domain expertise.",
12
+ "In our work, we show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.",
13
+ "In particular, we leverage the higher abundance of unlabelled data to capture the underlying data distribution of different conditions and subgroups for an imaging modality.",
14
+ "By conditioning generative models on appropriate labels, we can steer the distribution of synthetic examples according to specific requirements.",
15
+ "We demonstrate that these learned augmentations can surpass heuristic ones by making models more robust and statistically fair in- and out-of-distribution.",
16
+ "To evaluate the generality of our approach, we study 3 distinct medical imaging contexts of varying difficulty: (i) histopathology images from a publicly available generalisation benchmark, (ii) chest X-rays from publicly available clinical datasets, and (iii) dermatology images characterised by complex shifts and imaging conditions.",
17
+ "Complementing real training samples with synthetic ones improves the robustness of models in all three medical tasks and increases fairness by improving the accuracy of diagnosis within underrepresented groups.",
18
+ "This approach leads to stark improvements OOD across modalities: 7.7% prediction accuracy improvement in histopathology, 5.2% in chest radiology with 44.6% lower fairness gap and a striking 63.5% improvement in high-risk sensitivity for dermatology with a 7.5x reduction in fairness gap."
19
+ ]
20
+ },
21
+ {
22
+ "title": "Active Acquisition for Multimodal Temporal Data: A Challenging Decision-Making Task",
23
+ "abstract": [
24
+ "We introduce a challenging decision-making task that we call active acquisition for multimodal temporal data (A2MT).",
25
+ "In many real-world scenarios, input features are not readily available at test time and must instead be acquired at significant cost.",
26
+ "With A2MT, we aim to learn agents that actively select which modalities of an input to acquire, trading off acquisition cost and predictive performance.",
27
+ "A2MT extends a previous task called active feature acquisition to temporal decision making about high-dimensional inputs.",
28
+ "We propose a method based on the Perceiver IO architecture to address A2MT in practice.",
29
+ "Our agents are able to solve a novel synthetic scenario requiring practically relevant cross-modal reasoning skills.",
30
+ "On two large-scale, real-world datasets, Kinetics-700 and AudioSet, our agents successfully learn cost-reactive acquisition behavior.",
31
+ "However, an ablation reveals they are unable to learn adaptive acquisition strategies, emphasizing the difficulty of the task even for state-of-the-art models.",
32
+ "Applications of A2MT may be impactful in domains like medicine, robotics, or finance, where modalities differ in acquisition cost and informativeness."
33
+ ]
34
+ },
35
+ {
36
+ "title": "The promise of machine learning in predicting treatment outcomes in psychiatry",
37
+ "abstract": [
38
+ "For many years, psychiatrists have tried to understand factors involved in response to medications or psychotherapies, in order to personalize their treatment choices.",
39
+ "There is now a broad and growing interest in the idea that we can develop models to personalize treatment decisions using new statistical approaches from the field of machine learning and applying them to larger volumes of data.",
40
+ "In this pursuit, there has been a paradigm shift away from experimental studies to confirm or refute specific hypotheses towards a focus on the overall explanatory power of a predictive model when tested on new, unseen datasets.",
41
+ "In this paper, we review key studies using machine learning to predict treatment outcomes in psychiatry, ranging from medications and psychotherapies to digital interventions and neurobiological treatments.",
42
+ "Next, we focus on some new sources of data that are being used for the development of predictive models based on machine learning, such as electronic health records, smartphone and social media data, and on the potential utility of data from genetics, electrophysiology, neuroimaging and cognitive testing.",
43
+ "Finally, we discuss how far the field has come towards implementing prediction tools in real\u2010world clinical practice.",
44
+ "Relatively few retrospective studies to\u2010date include appropriate external validation procedures, and there are even fewer prospective studies testing the clinical feasibility and effectiveness of predictive models.",
45
+ "Applications of machine learning in psychiatry face some of the same ethical challenges posed by these techniques in other areas of medicine or computer science, which we discuss here.",
46
+ "In short, machine learning is a nascent but important approach to improve the effectiveness of mental health care, and several prospective clinical studies suggest that it may be working already."
47
+ ]
48
+ },
49
+ {
50
+ "title": "Understanding Client Support Strategies to Improve Clinical Outcomes in an Online Mental Health Intervention",
51
+ "abstract": [
52
+ "Online mental health interventions are increasingly important in providing access to, and supporting the effectiveness of, mental health treatment.",
53
+ "While these technologies are effective, user attrition and early disengagement are key challenges.",
54
+ "Evidence suggests that integrating a human supporter into such services mitigates these challenges, however, it remains under-studied how supporter involvement benefits client outcomes, and how to maximize such effects.",
55
+ "We present our analysis of 234,735 supporter messages to discover how different support strategies correlate with clinical outcomes.",
56
+ "We describe our machine learning methods for: (i) clustering supporters based on client outcomes; (ii) extracting and analyzing linguistic features from supporter messages; and (iii) identifying context-specific patterns of support.",
57
+ "Our findings indicate that concrete, positive and supportive feedback from supporters that reference social behaviors are strongly associated with better outcomes; and show how their importance varies dependent on different client situations.",
58
+ "We discuss design implications for personalized support and supporter interfaces."
59
+ ]
60
+ },
61
+ {
62
+ "title": "Hide-and-Seek Privacy Challenge",
63
+ "abstract": [
64
+ "The clinical time-series setting poses a unique combination of challenges to data modeling and sharing.",
65
+ "Due to the high dimensionality of clinical time series, adequate de-identification to preserve privacy while retaining data utility is difficult to achieve using common de-identification techniques.",
66
+ "An innovative approach to this problem is synthetic data generation.",
67
+ "From a technical perspective, a good generative model for time-series data should preserve temporal dynamics, in the sense that new sequences respect the original relationships between high-dimensional variables across time.",
68
+ "From the privacy perspective, the model should prevent patient re-identification by limiting vulnerability to membership inference attacks.",
+ "The NeurIPS 2020 Hide-and-Seek Privacy Challenge is a novel two-tracked competition to simultaneously accelerate progress in tackling both problems.",
+ "In our head-to-head format, participants in the synthetic data generation track (i.e. \"hiders\") and the patient re-identification track (i.e. \"seekers\") are directly pitted against each other by way of a new, high-quality intensive care time-series dataset: the AmsterdamUMCdb dataset.",
+ "Ultimately, we seek to advance generative techniques for dense and high-dimensional temporal data streams that are (1) clinically meaningful in terms of fidelity and predictivity, as well as (2) capable of minimizing membership privacy risks in terms of the concrete notion of patient re-identification."
+ ]
+ },
+ {
+ "title": "Causal Discovery for Causal Bandits utilizing Separating Sets",
+ "abstract": [
+ "The Causal Bandit is a variant of the classic Bandit problem where an agent must identify the best action in a sequential decision-making process, where the reward distribution of the actions displays a non-trivial dependence structure that is governed by a causal model.",
+ "All methods proposed thus far in the literature rely on exact prior knowledge of the causal model to obtain improved estimators for the reward.",
+ "We formulate a new causal bandit algorithm that is the first to no longer rely on explicit prior causal knowledge and instead uses the output of causal discovery algorithms.",
+ "This algorithm relies on a new estimator based on separating sets, a causal structure already known in causal discovery literature.",
+ "We show that given a separating set, this estimator is unbiased, and has lower variance compared to the sample mean.",
+ "We derive a concentration bound and construct a UCB-type algorithm based on this bound, as well as a Thompson sampling variant.",
+ "We compare our algorithms with traditional bandit algorithms on simulation data.",
+ "On these problems, our algorithms show a significant boost in performance."
+ ]
+ },
+ {
+ "title": "Machine Learning in Mental Health",
+ "abstract": [
+ "High prevalence of mental illness and the need for effective mental health care, combined with recent advances in AI, has led to an increase in explorations of how the field of machine learning (ML) can assist in the detection, diagnosis and treatment of mental health problems.",
+ "ML techniques can potentially offer new routes for learning patterns of human behavior; identifying mental health symptoms and risk factors; developing predictions about disease progression; and personalizing and optimizing therapies.",
+ "Despite the potential opportunities for using ML within mental health, this is an emerging research area, and the development of effective ML-enabled applications that are implementable in practice is bound up with an array of complex, interwoven challenges.",
+ "Aiming to guide future research and identify new directions for advancing development in this important domain, this article presents an introduction to, and a systematic review of, current ML work regarding psycho-socially based mental health conditions from the computing and HCI literature.",
+ "A quantitative synthesis and qualitative narrative review of 54 papers that were included in the analysis surfaced common trends, gaps, and challenges in this space.",
+ "Discussing our findings, we (i) reflect on the current state-of-the-art of ML work for mental health, (ii) provide concrete suggestions for a stronger integration of human-centered and multi-disciplinary approaches in research and development, and (iii) invite more consideration of the potentially far-reaching personal, social, and ethical implications that ML models and interventions can have, if they are to find widespread, successful adoption in real-world mental health contexts."
+ ]
+ },
+ {
+ "title": "Hide-and-Seek Privacy Challenge: Synthetic Data Generation vs. Patient Re-identification",
+ "abstract": [
+ "The clinical time-series setting poses a unique combination of challenges to data modelling and sharing.",
+ "Due to the high dimensionality of clinical time series, adequate deidentification to preserve privacy while retaining data utility is difficult to achieve using common de-identification techniques.",
+ "An innovative approach to this problem is synthetic data generation.",
+ "From a technical perspective, a good generative model for time-series data should preserve temporal dynamics; new sequences should respect the original relationships between high-dimensional variables across time.",
+ "From the privacy perspective, the model should prevent patient re-identification.",
+ "The NeurIPS 2020 Hide-and-Seek Privacy Challenge was a novel two-tracked competition to simultaneously accelerate progress in tackling both problems.",
+ "In our head-to-head format, participants in the generation track (\u201chiders\u201d) and the patient re-identification track (\u201cseekers\u201d) were directly pitted against each other by way of a new, high-quality intensive care time-series dataset: the AmsterdamUMCdb dataset."
+ ]
+ },
+ {
+ "title": "A Machine Learning Approach to Understanding Patterns of Engagement With Internet-Delivered Mental Health Interventions",
+ "abstract": [
+ "Key Points Question Can machine learning techniques be used to identify heterogeneity in patient engagement with internet-based cognitive behavioral therapy for symptoms of depression and anxiety?",
+ "Findings In this cohort study using data from 54\u2009604 individuals, 5 heterogeneous subtypes were identified based on patient engagement with the online intervention.",
+ "These subtypes were associated with different patterns of patient behavior and different levels of improvement in symptoms of depression and anxiety.",
+ "Meaning The findings of this study suggest that patterns of patient behavior may elucidate different modalities of engagement, which can help to conduct better triage for patients to provide personalized therapeutic activities, helping to improve outcomes and reduce the overall burden of mental health disorders."
+ ]
+ },
+ {
+ "title": "Causal Bandits without prior knowledge using separating sets",
+ "abstract": [
+ "The Causal Bandit is a variant of the classic Bandit problem where an agent must identify the best action in a sequential decision-making process, where the reward distribution of the actions displays a non-trivial dependence structure that is governed by a causal model.",
+ "Methods proposed for this problem thus far in the literature rely on exact prior knowledge of the full causal graph.",
+ "We formulate new causal bandit algorithms that no longer necessarily rely on prior causal knowledge.",
+ "Instead, they utilize an estimator based on separating sets, which we can find using simple conditional independence tests or causal discovery methods.",
+ "We show that, given a true separating set, for discrete i.i.d. data, this estimator is unbiased, and has variance which is upper bounded by that of the sample mean.",
+ "We develop algorithms based on Thompson Sampling and UCB for discrete and Gaussian models respectively and show increased performance on simulation data as well as on a bandit drawing from real-world protein signaling data."
+ ]
+ },
+ {
+ "title": "Differential associations of allergic disease genetic variants with developmental profiles of eczema, wheeze and rhinitis",
+ "abstract": [
+ "Allergic diseases (eczema, wheeze and rhinitis) in children often present as heterogeneous phenotypes.",
+ "Understanding genetic associations of specific patterns of symptoms might facilitate understanding of the underlying biological mechanisms."
+ ]
+ },
+ {
+ "title": "Individual risk assessment tool for school\u2010age asthma prediction in UK birth cohort",
+ "abstract": [
+ "Current published asthma predictive tools have moderate positive likelihood ratios (+LR) but high negative likelihood ratios (\u2212LR) based on their recommended cut\u2010offs, which limit their clinical usefulness."
+ ]
+ },
+ {
+ "title": "Trajectories of childhood immune development and respiratory health relevant to asthma and allergy",
+ "abstract": [
+ "Events in early life contribute to subsequent risk of asthma; however, the causes and trajectories of childhood wheeze are heterogeneous and do not always result in asthma.",
+ "Similarly, not all atopic individuals develop wheeze, and vice versa.",
+ "The reasons for these differences are unclear.",
+ "Using unsupervised model-based cluster analysis, we identified latent clusters within a prospective birth cohort with deep immunological and respiratory phenotyping.",
+ "We characterised each cluster in terms of immunological profile and disease risk, and replicated our results in external cohorts from the UK and USA.",
+ "We discovered three distinct trajectories, one of which is a high-risk \u2018atopic\u2019 cluster with increased propensity for allergic diseases throughout childhood.",
+ "Atopy contributes varyingly to later wheeze depending on cluster membership.",
+ "Our findings demonstrate the utility of unsupervised analysis in elucidating heterogeneity in asthma pathogenesis and provide a foundation for improving management and prevention of childhood asthma."
+ ]
+ },
+ {
+ "title": "Cytokine Responses to Rhinovirus and Development of Asthma, Allergic Sensitization, and Respiratory Infections during Childhood",
+ "abstract": [
+ "Rationale: Immunophenotypes of antiviral responses, and their relationship with asthma, allergy, and lower respiratory tract infections, are poorly understood.",
+ "Objectives: We characterized multiple cytokine responses of peripheral blood mononuclear cells to rhinovirus stimulation, and their relationship with clinical outcomes.",
+ "Methods: In a population\u2010based birth cohort, we measured 28 cytokines after stimulation with rhinovirus\u201016 in 307 children aged 11 years.",
+ "We used machine learning to identify patterns of cytokine responses, and related these patterns to clinical outcomes, using longitudinal models.",
+ "We also ascertained phytohemagglutinin\u2010induced T\u2010helper cell type 2 (Th2)\u2010cytokine responses (PHA\u2010Th2).",
+ "Measurements and Main Results: We identified six clusters of children based on their rhinovirus\u201016 responses, which were differentiated by the expression of four cytokine/chemokine groups: interferon\u2010related (IFN), proinflammatory (Inflam), Th2\u2010chemokine (Th2\u2010chem), and regulatory (Reg).",
+ "Clusters differed in their clinical characteristics.",
+ "Children with an IFNmodInflamhighestTh2\u2010chemhighestReghighest rhinovirus\u201016\u2010induced pattern had a PHA\u2010Th2low response, and a very low asthma risk (odds ratio [OR], 0.08; 95% confidence interval [CI], 0.01\u20100.81; P = 0.03).",
+ "Two clusters had a high risk of asthma and allergic sensitization, but with different trajectories from infancy to adolescence.",
+ "The IFNlowestInflamhighTh2\u2010chemlowRegmod cluster exhibited a PHA\u2010Th2lowest response and was associated with early\u2010onset asthma and sensitization, and the highest risk of asthma exacerbations (OR, 1.37; 95% CI, 1.07\u20101.76; P = 0.014) and lower respiratory tract infection hospitalizations (OR, 2.40; 95% CI, 1.26\u20104.58; P = 0.008) throughout childhood.",
+ "In contrast, the IFNhighestInflammodTh2\u2010chemmodReghigh cluster with a rhinovirus\u201016\u2010cytokine pattern was characterized by a PHA\u2010Th2highest response, and a low prevalence of asthma/sensitization in infancy that increased sharply to become the highest among all clusters by adolescence (but with a low risk of asthma exacerbations).",
+ "Conclusions: Early\u2010onset troublesome asthma with early\u2010life sensitization, later\u2010onset milder allergic asthma, and disease protection are each associated with different patterns of rhinovirus\u2010induced immune responses."
+ ]
+ },
+ {
+ "title": "Asthma phenotypes in childhood",
+ "abstract": [
+ "ABSTRACT Introduction: Asthma is no longer thought of as a single disease, but rather a collection of varying symptoms expressing different disease patterns.",
+ "One of the ongoing challenges is understanding the underlying pathophysiological mechanisms that may be responsible for the varying responses to treatment.",
+ "Areas Covered: This review provides an overview of our current understanding of the asthma phenotype concept in childhood and describes key findings from both conventional and data-driven methods.",
+ "Expert Commentary: With the vast amounts of data generated from cohorts, there is hope that we can elucidate distinct pathophysiological mechanisms, or endotypes.",
+ "In return, this would lead to better patient stratification and disease management, thereby providing true personalised medicine."
+ ]
+ },
+ {
+ "title": "Predictive Modelling Strategies to Understand Heterogeneous Manifestations of Asthma in Early Life",
+ "abstract": [
+ "Wheezing is common among children and \u223c50% of those under 6 years of age are thought to experience at least one episode of wheeze.",
+ "However, due to the heterogeneity of symptoms there are difficulties in treating and diagnosing these children.",
+ "\u2018Phenotype specific therapy\u2019 is one possible avenue of treatment, whereby we use significant pathology and physiology to identify and treat pre-schoolers with wheeze.",
+ "By performing feature selection algorithms and predictive modelling techniques, this study will attempt to determine if it is possible to robustly distinguish patient diagnostic categories among pre-school children.",
+ "Univariate feature analysis identified more objective variables and recursive feature elimination a larger number of subjective variables as important in distinguishing between patient categories.",
+ "Predictive modelling saw a drop in performance when subjective variables were removed from analysis, indicating that these variables are important in distinguishing wheeze classes.",
+ "We achieved 90%+ performance in AUC, sensitivity, specificity, and accuracy, and 80%+ in kappa statistic, in distinguishing ill from healthy patients.",
+ "Developed in a synergistic statistical and machine learning approach, our methodologies also propose a novel ROC Cross Evaluation method for model post-processing and evaluation.",
+ "Our predictive modelling's stability was assessed in computationally intensive Monte Carlo simulations."
+ ]
+ },
+ {
+ "title": "Features of asthma which provide meaningful insights for understanding the disease heterogeneity",
+ "abstract": [
+ "Data\u2010driven methods such as hierarchical clustering (HC) and principal component analysis (PCA) have been used to identify asthma subtypes, with inconsistent results."
+ ]
+ },
+ {
+ "title": "Non-parametric mixture models identify trajectories of childhood immune development relevant to asthma and allergy",
+ "abstract": [
+ "Events in early life contribute to subsequent risk of asthma; however, the causes and trajectories of childhood wheeze are heterogeneous and do not always result in asthma.",
+ "Similarly, not all atopic individuals develop wheeze, and vice versa.",
+ "The reasons for these differences are unclear.",
+ "Using unsupervised model-based cluster analysis, we identified latent clusters within a prospective birth cohort with deep immunological and respiratory phenotyping.",
+ "We characterised each cluster in terms of immunological profile and disease risk, and replicated our results in external cohorts from the UK and USA.",
+ "We discovered three distinct trajectories, one of which is a high-risk \u201catopic\u201d cluster with increased propensity for allergic diseases throughout childhood.",
+ "Atopy contributes varyingly to later wheeze depending on cluster membership.",
+ "Our findings demonstrate the utility of unsupervised analysis in elucidating heterogeneity in asthma pathogenesis and provide a foundation for improving management and prevention of childhood asthma."
+ ]
+ },
+ {
+ "title": "The importance of being earnest in epidemiology",
+ "abstract": [
+ "Big data sets and novel analytical methods present us with an opportunity and a challenge to push the boundaries of medical research to move towards more targeted, personalised management strategies (1).",
+ "The concept of \u2018big data\u2019 and the promises made concerning the potential of big data to generate novel hypotheses and provide most of the solutions in health care is becoming everyday parlance, with increasing belief that the data must and will speak for themselves.",
+ "In childhood allergic diseases, discovering latent structure and patterns within large data sets using unbiased machine learning techniques has been used to identify distinct subgroups (classes or clusters) under the umbrella diagnoses of asthma, eczema and rhinitis (2,3).",
+ "The study by Goks\u00f6r et al. (4) in the current issue of the journal is a timely reminder that in addition to disaggregating the structure in the big data sets using \u2018unbiased\u2019 analytical techniques, we need to look at the detailed associations of the temporality and comorbidity of symptoms via more traditional epidemiological approaches to fully understand the development of allergic diseases.",
+ "This study from the Swedish longitudinal cohort presents a valuable epidemiological analysis which may contribute to a better understanding of the development of allergic symptoms during childhood.",
+ "With the advent of the computational revolution combined with the amplified scale of biological, genetic and phenotypic healthcare data which has become available, the horizons of a data-driven hypothesis-generating approach to understanding disease have somewhat overshadowed the more traditional epidemiological hypothesis-testing approach based on carefully constructed scientific questions and observations.",
+ "The ever-increasing quantity of data which is generated has made it impossible at times to know what we are looking for, and what questions need to be asked.",
+ "One way of tackling this has been by taking an agnostic approach towards understanding the structure of the data by using unsupervised machine learning algorithms.",
+ "Machine learning searches through data to look for patterns, combining mathematical modelling for analysing high-dimensional data and computational statistics to identify model-based patterns or structure within the data.",
+ "However, nothing replaces and indeed nothing can replace carefully constructed epidemiological studies which ask specific questions and interrogate data sets to test specific hypotheses based on scientific intuition and prior clinical knowledge, and which act as a sanity check to validate the findings of hypothesis-generating studies.",
+ "In his classical textbook, Rothman describes epidemiology as the study of the distribution and determinants of disease, as well as its frequency and occurrence (5).",
+ "Epidemiology is concerned with understanding associations and risk factors of diseases, by using carefully constructed frameworks to understand disease causality (5).",
+ "These concepts, which seem so simple on the surface, are fundamental if we are to extract meaningful clinical interpretations of the vast amount of medical data with which we are constantly presented.",
+ "Goks\u00f6r et al. (4) use this epidemiological approach, presenting a carefully structured question to investigate whether allergic manifestations in early life are associated with an increased risk in allergic manifestations in later life.",
+ "The way in which they go about answering this question is likewise carefully constructed, looking at a range of symptoms and definitions of eczema, asthma, allergic rhinitis and food allergy.",
+ "This study takes us back to the fundamental principles of epidemiology, which is understanding causality in order to influence policy.",
+ "This was the aim of the \u2018first epidemiologist\u2019, Sir John Snow, who established that the cause of the 1854 cholera epidemic in London was contaminated water, rather than bacteria from the surrounding area, which led to the closing down of the water pumps in Soho.",
+ "We need to go back to the basics of epidemiology so that we do not miss the opportunity to capitalise on what epidemiology is all about: understanding causality and distinguishing causality from confounding.",
+ "Confounding, an important albeit too often overlooked concept in epidemiology, is the study of effects which may explain away apparently causal associations observed in a statistical model.",
+ "Confounding is a complex scenario and the existence of latent or unobserved confounders is an area which is acknowledged, but in which much development needs to be encouraged on the more practical level (6).",
+ "The thorough analysis by Goks\u00f6r et al. (4) has made an attempt to disaggregate all possible explanations of the development of, and the relationship between allergic manifestations during childhood, through carefully addressing confounding.",
+ "The results challenge the view that development of allergic manifestations in childhood is a progressive development of one disease.",
+ "Traditionally, the term \u2018atopic march\u2019 has been used to describe the progression of allergic disease from eczema to asthma and allergic rhinitis during childhood (7), and under this conceptual"
+ ]
+ },
+ {
+ "title": "Age, sex and the association between skin test responses and IgE titres with asthma",
+ "abstract": [
+ "Skin prick tests (SPTs) and allergen\u2010specific serum IgE (sIgE) measurements are the main diagnostic tools for confirming atopic sensitization.",
+ "Results are usually reported as \u2018positive\u2019 or \u2018negative\u2019, using the same arbitrary cut\u2010offs (SPT>3 mm, sIgE>0.35 kUA/l) across different ages and sexes.",
+ "We investigated the influence of age and sex on the interpretation of allergy test in the context of childhood asthma."
+ ]
+ },
+ {
+ "title": "Distinguishing benign from pathologic Th2-immunity in atopic children",
+ "abstract": [
+ "Patrick G Holt DSc FAA, Deborah Strickland PhD, Anthony Bosco PhD, Danielle Belgrave MD PhD, Belinda Hales BSc (Hons), Angela Simpson MD PhD, Elysia Hollams PhD, Barbara Holt BSc, Merci Kusel PhD, Staffan Ahlstedt PhD, Peter D Sly FRCP DSc and Adnan Custovic MD PhD. Telethon Kids Institute, University of Western Australia; Queensland Children\u2019s Medical Research Institute, The University of Queensland; Centre for Respiratory Medicine and Allergy, University of Manchester and University Hospital of South Manchester; Centre for Allergy Research, Karolinska Institute, Stockholm."
+ ]
+ },
+ {
+ "title": "Trajectories of lung function during childhood.",
+ "abstract": [
+ "RATIONALE\nDevelopmental patterns of lung function during childhood may have major implications for our understanding of the pathogenesis of respiratory disease throughout life.",
+ "\n\n\nOBJECTIVES\nTo explore longitudinal trajectories of lung function during childhood and factors associated with lung function decline.",
+ "\n\n\nMETHODS\nIn a population-based birth cohort, specific airway resistance (sRaw) was assessed at age 3 (n = 560), 5 (n = 829), 8 (n = 786), and 11 years (n = 644).",
+ "Based on prospective data (questionnaires, skin tests, IgE), children were assigned to wheeze phenotypes (no wheezing, transient, late-onset, and persistent) and atopy phenotypes (no atopy, dust mite, non-dust mite, multiple early, and multiple late).",
+ "We used longitudinal linear mixed models to determine predictors of change in sRaw over time.",
+ "\n\n\nMEASUREMENTS AND MAIN RESULTS\nContrary to the assumption that sRaw is independent of age and sex, boys had higher sRaw than girls (mean difference, 0.080; 95% confidence interval [CI], 0.049-0.111; P < 0.001) and a higher rate of increase over time.",
+ "For girls, sRaw increased by 0.017 kPa\u2009\u22c5\u2009s(-1) per year (95% CI, 0.011-0.023).",
+ "In boys this increase was significantly greater (P = 0.012; mean between-sex difference, 0.011 kPa\u2009\u22c5\u2009s(-1); 95% CI, 0.003-0.019).",
+ "Children with persistent wheeze (but not other wheeze phenotypes) had a significantly greater rate of deterioration in sRaw over time compared with never wheezers (P = 0.009).",
+ "Similarly, children with multiple early, but not other atopy phenotypes had significantly poorer lung function than those without atopy (mean difference, 0.116 kPa\u2009\u22c5\u2009s(-1); 95% CI, 0.065-0.168; P < 0.001).",
+ "sRaw increased progressively with the increasing number of asthma exacerbations.",
+ "\n\n\nCONCLUSIONS\nChildren with persistent wheeze, frequent asthma exacerbations, and multiple early atopy have diminished lung function throughout childhood, and are at risk of a progressive loss of lung function from age 3 to 11 years.",
+ "These effects are more marked in boys."
+ ]
+ },
+ {
+ "title": "Challenges in interpreting wheeze phenotypes: the clinical implications of statistical learning techniques.",
+ "abstract": [
+ "As we look into the new year, the ATS leadership would like to take this opportunity to thank Jacob I. Sznajder, M.D., for his outstanding stewardship as Editor in Chief of the American Journal of Respiratory and Critical Care Medicine over the past 4 years.",
+ "We celebrate all he and his team have accomplished, including maintaining the premier status of the Journal in the fields of pulmonary disease, critical care medicine, and sleep-disordered breathing.",
+ "Dr. Sznajder has overseen numerous improvements to the Journal, including the addition of new, timely content such as concise clinical reviews, pulmonary perspectives, recommended readings by fellows, the image section, podcasts, videos, and downloadable PowerPoint figures.",
+ "All of his efforts have strengthened the reputation of the Journal, increased readability, and maintained the impact factor of 11.04, the highest impact factor of all journals in the fields of critical care medicine and the respiratory system.",
+ "Dr. Sznajder\u2019s term will end on December 31, 2014.",
+ "As we begin our search for a new Editor, per the search announcement at atsjournals.org, the Search Committee is seeking candidates who are active investigators in the field and have substantial editorial experience.",
+ "We look forward to receiving applications from candidates who possess all the virtues of Dr. Sznajder with a vision to lead the Journal into the future."
+ ]
+ },
+ {
+ "title": "Genetic variants in endotoxin signalling pathway, domestic endotoxin exposure and asthma exacerbations",
+ "abstract": [
+ "We investigated the interaction between genetic variants in endotoxin signalling pathway and domestic endotoxin exposure in relation to asthma presence, and amongst children with asthma, we explored the association of these genetic variants and endotoxin exposure with hospital admissions due to asthma exacerbations."
+ ]
+ },
+ {
+ "title": "Developmental Profiles of Eczema, Wheeze, and Rhinitis: Two Population-Based Birth Cohort Studies",
+ "abstract": [
+ "Background The term \u201catopic march\u201d has been used to imply a natural progression of a cascade of symptoms from eczema to asthma and rhinitis through childhood.",
+ "We hypothesize that this expression does not adequately describe the natural history of eczema, wheeze, and rhinitis during childhood.",
+ "We propose that this paradigm arose from cross-sectional analyses of longitudinal studies, and may reflect a population pattern that may not predominate at the individual level.",
+ "Methods and Findings Data from 9,801 children in two population-based birth cohorts were used to determine individual profiles of eczema, wheeze, and rhinitis and whether the manifestations of these symptoms followed an atopic march pattern.",
+ "Children were assessed at ages 1, 3, 5, 8, and 11 y. We used Bayesian machine learning methods to identify distinct latent classes based on individual profiles of eczema, wheeze, and rhinitis.",
+ "This approach allowed us to identify groups of children with similar patterns of eczema, wheeze, and rhinitis over time.",
+ "Using a latent disease profile model, the data were best described by eight latent classes: no disease (51.3%), atopic march (3.1%), persistent eczema and wheeze (2.7%), persistent eczema with later-onset rhinitis (4.7%), persistent wheeze with later-onset rhinitis (5.7%), transient wheeze (7.7%), eczema only (15.3%), and rhinitis only (9.6%).",
+ "When latent variable"
+ ]
+ },
+ {
+ "title": "Polymorphisms of endotoxin pathway and endotoxin exposure: in vitro IgE synthesis and replication in a birth cohort",
+ "abstract": [
+ "Genetic variants in endotoxin signaling pathway are important in modulating the effect of environmental endotoxin on asthma and atopic phenotypes.",
+ "Our objective was to determine the single nucleotide polymorphisms (SNPs) in the endotoxin signaling pathway that may influence in vitro IgE synthesis and to investigate the relationship between these variants and endotoxin exposure in relation to the development of asthma and atopy in a birth cohort."
+ ]
+ },
+ {
+ "title": "Impact of rhinitis on asthma severity in school-age children",
+ "abstract": [
+ "In a population\u2010based sample of school\u2010age children, we investigated factors associated with rhinitis, and differences between allergic and nonallergic rhinitis.",
+ "Amongst children with asthma, we explored the association between rhinitis and asthma severity."
+ ]
+ },
+ {
+ "title": "Multiple atopy phenotypes and their associations with asthma: similar findings from two birth cohorts",
+ "abstract": [
+ "Although atopic sensitization is one of the strongest risk factors for asthma, its relationship with asthma is poorly understood.",
+ "We hypothesize that \u2018atopy\u2019 encompasses multiple sub\u2010phenotypes that relate to asthma in different ways."
+ ]
+ },
+ {
+ "title": "Challenges in identifying asthma subgroups using unsupervised statistical learning techniques.",
+ "abstract": [
+ "RATIONALE\nUnsupervised statistical learning techniques, such as exploratory factor analysis (EFA) and hierarchical clustering (HC), have been used to identify asthma phenotypes, with partly consistent results.",
+ "Some of the inconsistency is caused by the variable selection and demographic and clinical differences among study populations.",
+ "\n\n\nOBJECTIVES\nTo investigate the effects of the choice of statistical method and different preparations of data on the clustering results; and to relate these to disease severity.",
+ "\n\n\nMETHODS\nSeveral variants of EFA and HC were applied and compared using various sets of variables and different encodings and transformations within a dataset of 383 children with asthma.",
+ "Variables included lung function, inflammatory and allergy markers, family history, environmental exposures, and medications.",
+ "Clusters and original variables were related to asthma severity (logistic regression and Bayesian network analysis).",
+ "\n\n\nMEASUREMENTS AND MAIN RESULTS\nEFA identified five components (eigenvalues \u2265 1) explaining 35% of the overall variance.",
+ "Variations of the HC (as linkage-distance functions) did not affect the cluster inference; however, using different variable encodings and transformations did.",
+ "The derived clusters predicted asthma severity less than the original variables.",
+ "Prognostic factors of severity were medication usage, current symptoms, lung function, paternal asthma, body mass index, and age of asthma onset.",
+ "Bayesian networks indicated conditional dependence among variables.",
+ "\n\n\nCONCLUSIONS\nThe use of different unsupervised statistical learning methods and different variable sets and encodings can lead to multiple and inconsistent subgroupings of asthma, not necessarily correlated with severity.",
+ "The search for asthma phenotypes needs more careful selection of markers, consistent across different study populations, and more cautious interpretation of results from unsupervised learning."
+ ]
+ },
+ {
349
+ "title": "Long-term Exposure to PM10 and NO2 in Association with Lung Volume and Airway Resistance in the MAAS Birth Cohort",
350
+ "abstract": [
351
+ "Background: Findings from previous studies on the effects of air pollution exposure on lung function during childhood have been inconsistent.",
352
+ "A common limitation has been the quality of exposure data used, and few studies have modeled exposure longitudinally throughout early life.",
353
+ "Objectives: We sought to study the long-term effects of exposure to particulate matter with an aerodynamic diameter \u2264 10 \u03bcm (PM10) and to nitrogen dioxide (NO2) on specific airway resistance (sRaw) and forced expiratory volume in 1 sec (FEV1) before and after bronchodilator treatment.",
354
+ "Subjects were from the Manchester Asthma and Allergy Study (MAAS) birth cohort (n = 1,185).",
355
+ "Methods: Spirometry was performed during clinic visits at ages 3, 5, 8, and 11 years.",
356
+ "Individual-level PM10 and NO2 exposures were estimated from birth to 11 years of age through a microenvironmental exposure model.",
357
+ "Longitudinal and cross-sectional associations were estimated using generalized estimating equations and multivariable linear regression models.",
358
+ "Results: Lifetime exposure to PM10 and NO2 was associated with significantly less growth in FEV1 (percent predicted) over time, both before (\u20131.37%; 95% CI: \u20132.52, \u20130.23 for a 1-unit increase in PM10 and \u20130.83%; 95% CI: \u20131.39, \u20130.28 for a 1-unit increase in NO2) and after bronchodilator treatment (\u20133.59%; 95% CI: \u20135.36, \u20131.83 and \u20131.20%; 95% CI: \u20131.97, \u20130.43, respectively).",
359
+ "We found no association between lifetime exposure and sRaw over time.",
360
+ "Cross-sectional analyses of detailed exposure estimates for the summer and winter before 11 years of age and lung function at 11 years indicated no significant associations.",
361
+ "Conclusions: Long-term PM10 and NO2 exposures were associated with small but statistically significant reductions in lung volume growth in children of elementary-school age.",
362
+ "Citation: M\u00f6lter A, Agius RM, de Vocht F, Lindley S, Gerrard W, Lowe L, Belgrave D, Custovic A, Simpson A. 2013.",
363
+ "Long-term exposure to PM10 and NO2 in association with lung volume and airway resistance in the MAAS birth cohort.",
364
+ "Environ Health Perspect 121:1232\u20131238.",
365
+ "http://dx.doi.org/10.1289/ehp.1205961"
366
+ ]
367
+ },
368
+ {
369
+ "title": "Characterizing wheeze phenotypes to identify endotypes of childhood asthma, and the implications for future management",
370
+ "abstract": [
371
+ "It is now a commonly held view that asthma is not a single disease, but rather a set of heterogeneous diseases sharing common symptoms.",
372
+ "One of the major challenges in treating asthma is understanding these different asthma phenotypes and their underlying biological mechanisms.",
373
+ "This review gives an epidemiological perspective of our current understanding of the different phenotypes that develop from birth to childhood that come under the umbrella term \u2018asthma\u2019.",
374
+ "The review focuses mainly on publications from longitudinal birth cohort studies where the natural history of asthma symptoms is observed over time in the whole population.",
375
+ "Identifying distinct pathophysiological mechanisms for these different phenotypes will potentially elucidate different asthma endotypes, ultimately leading to more effective treatment and management strategies."
376
+ ]
377
+ },
378
+ {
379
+ "title": "Challenges in interpreting allergen microarrays in relation to clinical symptoms: A machine learning approach",
380
+ "abstract": [
381
+ "Identifying different patterns of allergens and understanding their predictive ability in relation to asthma and other allergic diseases is crucial for the design of personalized diagnostic tools."
382
+ ]
383
+ },
384
+ {
385
+ "title": "Differing associations of BMI and body fat with asthma and lung function in children",
386
+ "abstract": [
387
+ "Current evidence suggests that in children there is a significant, albeit weak, association between asthma and obesity.",
388
+ "Studies generally use body mass index (BMI) in evaluating body adiposity, but there are limitations to its use."
389
+ ]
390
+ },
391
+ {
392
+ "title": "Genetic variation in vascular endothelial growth factor-a and lung function.",
393
+ "abstract": [
394
+ "RATIONALE\nGiven the role of vascular endothelial growth factor (VEGF) in lung development, we hypothesized that polymorphisms in VEGF-A may be associated with lung function.",
395
+ "\n\n\nOBJECTIVES\nThe current study was designed to assess the role of genetic variants in VEGF-A as determinants of airway function from infancy through early adulthood.",
396
+ "\n\n\nMETHODS\nAssociation between five single-nucleotide polymorphisms (SNPs) in VEGF-A and lung function were assessed longitudinally in two unselected birth cohorts and cross-sectionally among infants.",
397
+ "Replication with two SNPs was conducted in adults and children with asthma.",
398
+ "We investigated the functionality of the SNP most consistently associated with lung function (rs3025028) using Western blotting to measure the ratio of plasma VEGF-A(165b)/panVEGF-A(165) among homozygotes.",
399
+ "\n\n\nMEASUREMENTS AND MAIN RESULTS\nIn two populations in infancy, C-allele homozygotes of rs3025028 had significantly higher VmaxFRC, forced expiratory flow(50), and forced expiratory flow(25-75) compared with other genotype groups.",
400
+ "Among preschool children (age 3 yr), C allele of rs3025028 was associated with significantly higher specific airway conductance, with similar findings observed for lung function in school-age children.",
401
+ "For FEV(1)/FVC ratio similar findings were observed among adolescents and young adults (birth cohort), and then replicated in adults and schoolchildren with asthma (cross-sectional studies).",
402
+ "For rs3025038, plasma VEGF-A(165b)/panVEGF-A(165) was significantly higher among CC versus GG homozygotes (P \u2264 0.02) at birth, in school-age children, and in adults.",
403
+ "\n\n\nCONCLUSIONS\nWe report significant associations between VEGF-A SNP rs3025028 and parameters of airway function measured throughout childhood, with the effect persisting into adulthood.",
404
+ "We propose that the mechanism may be mediated through the ratios of active and inhibitory isoforms of VEGF-A(165), which may be determined by alternative splicing."
405
+ ]
406
+ },
407
+ {
408
+ "title": "Bayesian Machine Learning Approaches for Longitudinal Latent Class Modelling to Define Wheezing Phenotypes to Elucidate Environmental Associates",
409
+ "abstract": [
410
+ "Accurate phenotypic definition of wheezing in childhood can lead to a greater understanding of the distinct physiological markers associated with different wheeze phenotypes.",
411
+ "This paper looks at Bayesian machine learning approaches using Infer.",
412
+ "NET to define wheeze phenotypes based on both parental questionnaires and General Practitioner data on patterns of asthma and wheeze consultation within the first eight years of life.",
413
+ "We illustrate a taxonomy of longitudinal latent class item response models with varying modelling assumptions to determine wheeze phenotypes (latent classes) for homogenous groups of children."
414
+ ]
415
+ },
416
+ {
417
+ "title": "A longitudinal study investigating factors associated with changes in lung function over time in early life (age 3 to 11)",
418
+ "abstract": [
419
+ "Background: Previous studies have investigated factors associated with poor lung function in children.",
420
+ "However, these studies use a cross-sectional approach which ignores changes in lung function over time.",
421
+ "In this study, we develop multilevel longitudinal models in order to investigate factors affecting developmental change in lung function in early life.",
422
+ "\n\nMethods: In a population-based birth cohort 1185 participants were recruited prenatally and followed prospectively (1, 3, 5, 8 and 11 years).",
423
+ "At each time point, a validated questionnaire was administered to collect information on asthma-related symptoms, height and weight.",
424
+ "We assessed atopy and lung function (Specific Airway Resistance (sRaw), plethysmography) at each follow-up.",
425
+ "We use a longitudinal linear mixed models approach to determine predictors of change in sRaw over time.",
426
+ "\n\nResults: Univariate longitudinal analyses revealed marked deterioration in sRaw among children who were atopic (mean difference 2.85%, 95% CI 1.05%-9.84%, p=0.003).",
427
+ "Children who wheezed also had poorer lung function (mean difference 5.20%, 95% CI 0.87%-2.54%, p<0.001).",
428
+ "Boys had poorer lung function compared to girls (mean difference 3.48%, 95% CI 0.87%-6.52%, p=0.03) and also had a higher rate of deterioration of sRaw over time which increased by 0.011 units (p=0.012) per year.",
429
+ "In a multivariate longitudinal model, the factors which best predicted diminished lung function were atopy, increased BMI, paternal atopy, gender and current wheeze.",
430
+ "\n\nConclusion: Multilevel longitudinal models allow us to predict factors associated with diminished lung function as well as factors associated with change in lung function over time."
431
+ ]
432
+ },
433
+ {
434
+ "title": "A Comparison of Frequentist and Bayesian Approaches to Latent Class Modelling of Susceptibility to Asthma and Patterns of Antibiotic Prescriptions in Early Life Student Poster Presentation",
435
+ "abstract": [
436
+ "The assessment of patterns of antibiotic use in early life may have major implications for understanding the development of asthma.",
437
+ "This paper compares a classical generalized latent variable modelling framework and a Bayesian machine learning approach to define latent classes of susceptibility to asthma based on patterns of antibiotic use in early life.",
438
+ "We compare the potential advantages of each method for elucidating clinically meaningful phenotypes or classes."
439
+ ]
440
+ },
441
+ {
442
+ "title": "An international collaborative study to investigate a proposed reference method for the determination of potency measurements of fibrinolytics in absolute units",
443
+ "abstract": [
444
+ "Traditionally, WHO International Standards (IS) have been calibrated in International Units (IU) by consensus following an international collaborative study.",
445
+ "In the area of coagulation and fibrinolysis standards it is also common for laboratories involved in such studies to perform their own in-house methods, although guidelines may be defined to include recommendations of replication and randomization of sample testing to improve the robustness of the study.",
446
+ "The historic basis of this approach has been to develop a common reference standard to facilitate comparisons of results for the relative potency of standard and test preparations in laboratories using different methods [1].",
447
+ "However, this approach has been criticized and suggestions for improvements have been made which would bring the standardization of biologicals more in line with other calibrators used in medicinal chemistry.",
448
+ "Guidelines for the introduction of a metrologically sound approach to standardization have been detailed elsewhere [2].",
449
+ "General goals include standardization of methods and the introduction of a hierarchy of reference materials and procedures, each with an assigned uncertainty, to provide a system of metrological traceability where testing of routine samples can ultimately be traced back to a primary calibrator and primary reference method that are defined in SI units [3].",
450
+ "We have previously published a proposal for a reference method developed to measure the potency of thrombolytic products (plasminogen activators) [4] that did allow the calculation of activity in enzyme units (moles of product per second in the defined method), which could theoretically be converted into katals.",
451
+ "The katal (mole per second) is the coherent derived SI unit of measurement for enzyme activity and is at the top of the hierarchy for the catalytic concentration of an enzyme [5].",
452
+ "Thus it would be possible theoretically to define the concentration of an IS not only in IU but also by absolute SI units.",
453
+ "With the encouragement of the Fibrinolysis SSC, an international collaborative study was organized in which laboratories expert in fibrinolysis methods were recruited to perform the defined method [4] using the current IS for urokinase (uPA, 87/594), tissue plasminogen activator, (tPA, 98/714) and streptokinase (SK, 00/464).",
454
+ "The study was planned as far as possible to remove possible sources of variation.",
455
+ "All necessary reagents were provided, including IS, plasminogen substrate (NIBSC reagent 97/534), and chromogenic substrate for plasmin CS-41(03) (Hyphen BioMed, France).",
456
+ "In addition, thrombin (NIBSC reagent 01/578) and fibrinogen concentrate (1st IS 98/614) were provided in order to make clots as consistently as possible in all laboratories.",
457
+ "Plasmin (3rd IS, 97/ 536) was also provided to all laboratories to perform a series of assays in which a range of concentrations of chromogenic substrate was hydrolysed completely in order to calculate an extinction value for p-nitroaniline for each laboratory that was specific to the equipment they used.",
458
+ "This value was critical to calculate the molar concentration of p-nitroaniline released during the plasminogen activation reaction, which in turn allows the molar concentration and rate of plasmin generation to be calculated and thus express the activity of the plasminogen activator in SI units.",
459
+ "The only materials provided by the laboratories in the study were Tris buffer and microtitre plates.",
460
+ "A detailed collaborative study protocol was agreed in conjunction with participating laboratories and other outside interested parties over a series of months ahead of the practical phase of the study.",
461
+ "Twelve participants contributed a total of 36 assays, each of which included the three activators, uPA, tPA and SK at four doses, in quadruplicate for each point.",
462
+ "Raw data of absorbance vs. time were returned to NIBSC for analysis, where rates of plasminogen Correspondence: C. Longstaff, Haemostasis Section, National Institute for Biological Standards and Control, South Mimms, Herts, EN6 3QG, UK.",
463
+ "Tel.:",
464
+ "+44 1707 641253; fax: +44 1707 641050; e-mail: clongstaff@ nibsc.ac.uk"
465
+ ]
466
+ },
467
+ {
468
+ "title": "Standardising methodology in fibrinolysis assays: report of a collaborative study on a potential reference method for potency determinations of thrombolytics: On behalf of the Fibrinolysis Subcommittee of the Scientific and Standardization Committee, ISTH",
469
+ "abstract": [
470
+ "___________________ *correspondence to: C Longstaff, Dept of Haematology, NIBSC, Blanche Lane, South Mimms, Herts EN6 3QG, UK, Phone: 44 1707 641253, Fax: 44 1707 646730, Email: clongstaff@nibsc.ac.uk Haemostasis Section and Biostatistics Section, National Institute for Biological Standards and Control, South Mimms, Herts, UK EN63QG Report to Fibrinolysis SSC of the ISTH August 2005, Sydney, Australia Standardising methodology in fibrinolysis assays: report of a collaborative study on a potential reference method for potency determinations of thrombolytics:"
471
+ ]
472
+ },
473
+ {
474
+ "title": "Multicentre evaluation of stable reference whole blood for enumeration of lymphocyte subsets by flow cytometry.",
475
+ "abstract": [
476
+ "BACKGROUND: Clinical indications for lymphocyte subset enumeration by flow cytometry include monitoring of disease progression and timing of therapeutic intervention in infection with human immunodeficiency virus.",
477
+ "Until recently international standardisation has not been possible due to a lack of suitable stable reference material.",
478
+ "METHODS: This study consisted of two trials of a stabilised whole blood preparation.",
479
+ "Eleven participants were sent two standard protocols for staining plus gating strategy and asked to report absolute counts for lymphocyte subsets.",
480
+ "RESULTS: No significant difference was detected between the two methods when results from the two assays and all partners were pooled.",
481
+ "Significant differences in results from the different partners were observed.",
482
+ "However, representative mean counts were obtained for geometric means, geometric coefficient of variation, and 95% confidence interval for CD3 910 cells/mul, 9%, and 888 to 933, respectively), CD4 (495 cells/mul, 12%, and 483 to 507), and CD8 (408 cells/mul, 13%, and 393 to 422).",
483
+ "CONCLUSION: We have introduced a stabilised blood preparation and a well-characterized biological standard.",
484
+ "The availability of this reference material greatly simplifies the validation of new techniques for CD4(+) T-cell enumeration and the expansion of external quality assurance programmes for clinical laboratories, including those that operate in resource-restricted environments. (",
485
+ "c) 2005 Wiley-Liss, Inc."
486
+ ]
487
+ },
488
+ {
489
+ "title": "PROPOSED 1 st INTERNATIONAL STANDARD FOR FACTOR XIII, PLASMA (02/206) FINAL REPORT AND RECOMMENDATIONS",
490
+ "abstract": [
491
+ "SUMMARY An international collaborative study, involving 23 laboratories was carried out to calibrate the 1 st International Standard for factor XIII (FXIII) plasma.",
492
+ "This study also investigated the relationships between measurements of FXIII in concentrates vs plasma and between measurement of FXIII activity and FXIII antigen levels.",
493
+ "Furthermore, it also gave an opportunity to calibrate two SSC secondary coagulation plasma standards (Lot 2 & Lot 3) with FXIII activity (and FXIII:Ag) levels."
494
+ ]
495
+ },
496
+ {
497
+ "title": "Trajectories of childhood immune development and respiratory 1 health relevant to asthma and allergy 2 3",
498
+ "abstract": [
499
+ "3 Howard H.F. Tang 1,2 , Shu Mei Teo 1,3 , Danielle C.M. Belgrave 4 , Michael D. Evans 5 , Daniel J. 4 Jackson 5 , Marta Brozynska 1,4 , Merci M.H. Kusel 6 , Sebastian L. Johnston 7 , James E. Gern 5 , 5 Robert F. Lemanske 5 , Angela Simpson 8 , Adnan Custovic 4 , Peter D. Sly 6,9 , Patrick G. Holt 6,9 , 6 Kathryn E. Holt 10,11 , Michael Inouye 1,3,12 7 8 1 Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, 9 Melbourne, Victoria, Australia 10 2 School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia 11 3 Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary 12 Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom 13 4 Department of Paediatrics, Imperial College London, United Kingdom 14 5 University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA 15 6 Telethon Kids Institute, University of Western Australia, Perth, Western Australia, 16 Australia 17 7 Airway Disease Infection Section and MRC & Asthma UK Centre in Allergic Mechanisms 18 of Asthma, National Heart and Lung Institute, Imperial College London, Norfolk Place, 19 London, United Kingdom 20 8 Division of Infection, Immunity and Respiratory Medicine, The University of Manchester 21 9 Child Health Research Centre, The University of Queensland, Brisbane, Queensland, 22 Australia 23 10 Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, 24 Parkville, Victoria, Australia 25 11 The London School of Hygiene and Tropical Medicine, London WC1E 7HT, United 26 Kingdom.",
500
+ "27 12 The Alan Turing Institute, London, United Kingdom 28 29 30 * Correspondence: HHFT (Howard.",
501
+ "Tang@baker.edu.au) and MI 31 (mi336@medschl.cam.ac.uk) 32 33"
502
+ ]
503
+ }
504
+ ],
505
+ "user_kps": [
506
+ "active perception",
507
+ "allergens",
508
+ "biological insight",
509
+ "blood parameters",
510
+ "causal learning",
511
+ "clinical prediction models",
512
+ "cluster analyses",
513
+ "complex phenotypes",
514
+ "computational epidemiology",
515
+ "dataset bias",
516
+ "dsm models",
517
+ "flow cytometry",
518
+ "genome-wide association studies",
519
+ "genome-wide association study",
520
+ "longitudinal development",
521
+ "machine-learned models",
522
+ "personalized support",
523
+ "privacy-preserving data",
524
+ "regulatory pathways",
525
+ "respiratory diseases"
526
+ ]
527
+ }
data/users/ddowney/embeds-ddowney-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8e319501fd73137bcb490ff59d005bd4420b71f686b84c7756539f8ebabeed36
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2723ec380d35b80d06b4076e165aa536bd9861409d4535bea3a5af418ceacdd8
3
+ size 657536
data/users/ddowney/embeds-ddowney-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:499dac16cc0307a1b1bfdf37390253c8d9d79c31ce732172026fd94bd8d6f269
3
- size 449506
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6abb2313a7c2a0ca5fbffd5eb576f8e7a36cdfa17d4ebadc34fbc292a3c7af56
3
+ size 2201110
data/users/ddowney/pid2idx-ddowney-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59, "60": 60, "61": 61, "62": 62, "63": 63, "64": 64, "65": 65, "66": 66, "67": 67, "68": 68, "69": 69, "70": 70, "71": 71, "72": 72, "73": 73, "74": 74, "75": 75, "76": 76, "77": 77, "78": 78, "79": 79, "80": 80, "81": 81, "82": 82, "83": 83, "84": 84, "85": 85, "86": 86, "87": 87, "88": 88, "89": 89, "90": 90, "91": 91, "92": 92, "93": 93, "94": 94, "95": 95, "96": 96, "97": 97, "98": 98, "99": 99, "100": 100, "101": 101, "102": 102, "103": 103, "104": 104, "105": 105, "106": 106}
data/users/ddowney/seedset-ddowney-maple.json CHANGED
The diff for this file is too large to render. See raw diff
 
data/users/hzamani/embeds-hzamani-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e1428ed4467dfeed72df61072d30a6c5e3dcba3c2aa0023962bf25b35306190a
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:87e8b04691430294204d5cf6e50eb5792872a9585534dbc97717384abeb3cf6a
3
+ size 743552
data/users/hzamani/embeds-hzamani-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c23ba67484bd2cd37444eb257c3734d011fad4f6236decfbbc34a65358936a9a
3
- size 449506
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4bef9bb73dad05ccab18c6786b8002e4eefb68895591eca1312ba2954aca6d43
3
+ size 3031222
data/users/hzamani/pid2idx-hzamani-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59, "60": 60, "61": 61, "62": 62, "63": 63, "64": 64, "65": 65, "66": 66, "67": 67, "68": 68, "69": 69, "70": 70, "71": 71, "72": 72, "73": 73, "74": 74, "75": 75, "76": 76, "77": 77, "78": 78, "79": 79, "80": 80, "81": 81, "82": 82, "83": 83, "84": 84, "85": 85, "86": 86, "87": 87, "88": 88, "89": 89, "90": 90, "91": 91, "92": 92, "93": 93, "94": 94, "95": 95, "96": 96, "97": 97, "98": 98, "99": 99, "100": 100, "101": 101, "102": 102, "103": 103, "104": 104, "105": 105, "106": 106, "107": 107, "108": 108, "109": 109, "110": 110, "111": 111, "112": 112, "113": 113, "114": 114, "115": 115, "116": 116, "117": 117, "118": 118, "119": 119, "120": 120}
data/users/hzamani/seedset-hzamani-maple.json CHANGED
The diff for this file is too large to render. See raw diff
 
data/users/jbragg/embeds-jbragg-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0bac97ad7eb681742fae96ce40a510ff3b509bc9bc54d072de9ece15ec6f6769
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:af96b8e27985a85f34e5bc31b623a0a690ef709b8061556a2d509112811df4d5
3
+ size 233600
data/users/jbragg/embeds-jbragg-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ef3f6b009202693bf93ef83a33d462b5de90386767f72e19b9f8cf068139f2ea
3
- size 421858
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b1310b073816f0934d21a7d47cd55dc0def863158f0453c330363d89701828c8
3
+ size 818848
data/users/jbragg/pid2idx-jbragg-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37}
data/users/jbragg/seedset-jbragg-maple.json CHANGED
@@ -20,10 +20,10 @@
20
  "abstract": [
21
  "In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors.",
22
  "Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery.",
23
- "However, these systems do not capture and utilize users' evolving knowledge of authors.",
24
  "We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration.",
25
- "In contrast to paper-centric interaction in prior systems, ComLittee's author-centric interaction supports curation of research threads from individual authors, finding new authors and papers with combined signals from a paper recommender and the curated authors' authorship graphs, and understanding them in the context of those signals.",
26
- "In a within-subjects experiment that compares to an author-highlighting approach, we demonstrate how ComLittee leads to a higher efficiency, quality, and novelty in author discovery that also improves paper discovery."
27
  ]
28
  },
29
  {
@@ -31,19 +31,30 @@
31
  "abstract": [
32
  "When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work.",
33
  "However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews.",
34
- "This paper introduces CiteSee, a paper reading tool that leverages a user's publishing, reading, and saving activities to provide personalized visual augmentations and context around citations.",
35
  "First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened.",
36
  "Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history.",
37
  "We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines.",
38
  "A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citation when conducting real-world literature reviews."
39
  ]
40
  },
 
  {
42
  "title": "Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections",
43
  "abstract": [
44
  "Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers.",
45
  "As scientific literature grows, this becomes increasingly challenging.",
46
- "Meanwhile, authors summarize prior research in papers' related work sections, though this is scoped to support a single paper.",
47
  "A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci.",
48
  "In this work, we design a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information.",
49
  "From a within-subjects user study (n=15), we found that scholars generate more coherent, insightful, and comprehensive topic outlines using Relatedly compared to a baseline paper list."
@@ -72,6 +83,31 @@
72
  "We sketch three components for AI support design and discuss considerations for future research."
73
  ]
74
  },
75
  {
76
  "title": "A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers",
77
  "abstract": [
@@ -117,17 +153,6 @@
117
  "Our qualitative results also highlight people\u2019s preference for this more effective exploratory search experience enabled by FeedLens."
118
  ]
119
  },
120
- {
121
- "title": "Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing",
122
- "abstract": [
123
- "When seeking information not covered in patient-friendly documents, healthcare consumers may turn to the research literature.",
124
- "Reading medical papers, however, can be a challenging experience.",
125
- "To improve access to medical papers, we explore four features enabled by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guides readers to answering passages, and plain language summaries of those passages.",
126
- "We embody these features into a prototype system, Paper Plain.",
127
- "We evaluate Paper Plain, finding that participants who used the prototype system had an easier time reading research papers without a loss in paper comprehension compared to those who used a typical PDF reader.",
128
- "Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers."
129
- ]
130
- },
131
  {
132
  "title": "Scim: Intelligent Skimming Support for Scientific Papers",
133
  "abstract": [
@@ -139,6 +164,17 @@
139
  "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
140
  ]
141
  },
142
  {
143
  "title": "From Who You Know to What You Read: Augmenting Scientific Recommendations with Implicit Social Networks",
144
  "abstract": [
@@ -231,35 +267,237 @@
231
  "title": "GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation",
232
  "abstract": [
233
  "Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository.",
234
- "Their adoption, however, is so far limited to tasks which can be reli-ably evaluated in an automatic manner.",
235
  "This work introduces G ENIE , an extensible human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks.",
236
  "G E - NIE automatically posts leaderboard submissions to crowdsourcing platforms asking human annotators to evaluate them on various axes (e.g., correctness, conciseness, \ufb02uency), and compares their answers to various automatic metrics.",
237
  "We introduce several datasets in English to G ENIE , representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension.",
238
  "We provide formal granular evaluation metrics and identify areas for future research.",
239
  "We make G ENIE publicly available, 1 and hope that it will spur progress in language generation models as well as their automatic and manual evaluation."
240
  ]
241
  }
242
  ],
243
  "user_kps": [
244
- "accessible user interfaces",
245
  "attentive user interfaces",
246
- "biomedical literature mining",
247
  "braille documents",
248
  "citation context analysis",
249
  "citation network",
250
- "collaborative writing",
251
- "document representations",
252
  "exploratory searches",
253
  "few-shot learning",
254
- "hl7 's clinical document architecture",
255
- "human annotations",
256
  "human readability",
257
  "human readers",
258
  "information accessibility",
259
  "literature-based discovery",
260
  "natural language generation",
261
- "scholarly communication",
262
- "text comprehension",
263
  "textual interface"
264
  ]
265
  }
 
20
  "abstract": [
21
  "In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors.",
22
  "Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery.",
23
+ "However, these systems do not capture and utilize users\u2019 evolving knowledge of authors.",
24
  "We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration.",
25
+ "In contrast to paper-centric interaction in prior systems, ComLittee\u2019s author-centric interaction supports curating research threads from individual authors, finding new authors and papers using combined signals from a paper recommender and the curated authors\u2019 authorship graphs, and understanding them in the context of those signals.",
26
+ "In a within-subjects experiment that compares to a paper-centric discovery system with author-highlighting, we demonstrate how ComLittee improves author and paper discovery."
27
  ]
28
  },
29
  {
 
31
  "abstract": [
32
  "When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work.",
33
  "However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews.",
34
+ "This paper introduces CiteSee, a paper reading tool that leverages a user\u2019s publishing, reading, and saving activities to provide personalized visual augmentations and context around citations.",
35
  "First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened.",
36
  "Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history.",
37
  "We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines.",
38
  "A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citation when conducting real-world literature reviews."
39
  ]
40
  },
41
+ {
42
+ "title": "Papeos: Augmenting Research Papers with Talk Videos",
43
+ "abstract": [
44
+ "Research consumption has been traditionally limited to the reading of academic papers\u2014a static, dense, and formally written format.",
45
+ "Alternatively, pre-recorded conference presentation videos, which are more dynamic, concise, and colloquial, have recently become more widely available but potentially under-utilized.",
46
+ "In this work, we explore the design space and benefits for combining academic papers and talk videos to leverage their complementary nature to provide a rich and fluid research consumption experience.",
47
+ "Based on formative and co-design studies, we present Papeos, a novel reading and authoring interface that allow authors to augment their papers by segmenting and localizing talk videos alongside relevant paper passages with automatically generated suggestions.",
48
+ "With Papeos, readers can visually skim a paper through clip thumbnails, and fluidly switch between consuming dense text in the paper or visual summaries in the video.",
49
+ "In a comparative lab study (n=16), Papeos reduced mental load, scaffolded navigation, and facilitated more comprehensive reading of papers."
50
+ ]
51
+ },
52
  {
53
  "title": "Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections",
54
  "abstract": [
55
  "Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers.",
56
  "As scientific literature grows, this becomes increasingly challenging.",
57
+ "Meanwhile, authors summarize prior research in papers\u2019 related work sections, though this is scoped to support a single paper.",
58
  "A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci.",
59
  "In this work, we design a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information.",
60
  "From a within-subjects user study (n=15), we found that scholars generate more coherent, insightful, and comprehensive topic outlines using Relatedly compared to a baseline paper list."
 
83
  "We sketch three components for AI support design and discuss considerations for future research."
84
  ]
85
  },
86
+ {
87
+ "title": "RCT Rejection Sampling for Causal Estimation Evaluation",
88
+ "abstract": [
89
+ "Confounding is a significant obstacle to unbiased estimation of causal effects from observational data.",
90
+ "For settings with high-dimensional covariates -- such as text data, genomics, or the behavioral social sciences -- researchers have proposed methods to adjust for confounding by adapting machine learning methods to the goal of causal estimation.",
91
+ "However, empirical evaluation of these adjustment methods has been challenging and limited.",
92
+ "In this work, we build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data: subsampling randomized controlled trials (RCTs) to create confounded observational datasets while using the average causal effects from the RCTs as ground-truth.",
93
+ "We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data to allow for valid comparisons to the ground-truth RCT.",
94
+ "Using synthetic data, we show our algorithm indeed results in low bias when oracle estimators are evaluated on the confounded samples, which is not always the case for a previously proposed algorithm.",
95
+ "In addition to this identification result, we highlight several finite data considerations for evaluation designers who plan to use RCT rejection sampling on their own datasets.",
96
+ "As a proof of concept, we implement an example evaluation pipeline and walk through these finite data considerations with a novel, real-world RCT -- which we release publicly -- consisting of approximately 70k observations and text data as high-dimensional covariates.",
97
+ "Together, these contributions build towards a broader agenda of improved empirical evaluation for causal estimation."
98
+ ]
99
+ },
100
+ {
101
+ "title": "ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews",
102
+ "abstract": [
103
+ "Revising scientific papers based on peer feedback is a challenging task that requires not only deep scientific knowledge and reasoning, but also the ability to recognize the implicit requests in high-level feedback and to choose the best of many possible ways to update the manuscript in response.",
104
+ "We introduce this task for large language models and release ARIES, a dataset of review comments and their corresponding paper edits, to enable training and evaluating models.",
105
+ "We study two versions of the task: comment-edit alignment and edit generation, and evaluate several baselines, including GPT-4.",
106
+ "We find that models struggle even to identify the edits that correspond to a comment, especially in cases where the comment is phrased in an indirect way or where the edit addresses the spirit of a comment but not the precise request.",
107
+ "When tasked with generating edits, GPT-4 often succeeds in addressing comments on a surface level, but it rigidly follows the wording of the feedback rather than the underlying intent, and includes fewer technical details than human-written edits.",
108
+ "We hope that our formalization, dataset, and analysis will form a foundation for future work in this area."
109
+ ]
110
+ },
111
  {
112
  "title": "A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers",
113
  "abstract": [
 
153
  "Our qualitative results also highlight people\u2019s preference for this more effective exploratory search experience enabled by FeedLens."
154
  ]
155
  },
156
  {
157
  "title": "Scim: Intelligent Skimming Support for Scientific Papers",
158
  "abstract": [
 
164
  "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
165
  ]
166
  },
167
+ {
168
+ "title": "Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing",
169
+ "abstract": [
170
+ "When seeking information not covered in patient-friendly documents, healthcare consumers may turn to the research literature.",
171
+ "Reading medical papers, however, can be a challenging experience.",
172
+ "To improve access to medical papers, we explore four features enabled by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guides readers to answering passages, and plain language summaries of those passages.",
173
+ "We embody these features into a prototype system, Paper Plain.",
174
+ "We evaluate Paper Plain, finding that participants who used the prototype system had an easier time reading research papers without a loss in paper comprehension compared to those who used a typical PDF reader.",
175
+ "Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers."
176
+ ]
177
+ },
178
  {
179
  "title": "From Who You Know to What You Read: Augmenting Scientific Recommendations with Implicit Social Networks",
180
  "abstract": [
 
267
  "title": "GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation",
268
  "abstract": [
269
  "Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository.",
270
+ "Their adoption, however, is so far limited to tasks which can be reliably evaluated in an automatic manner.",
271
  "This work introduces G ENIE , an extensible human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks.",
272
  "G E - NIE automatically posts leaderboard submissions to crowdsourcing platforms asking human annotators to evaluate them on various axes (e.g., correctness, conciseness, \ufb02uency), and compares their answers to various automatic metrics.",
273
  "We introduce several datasets in English to G ENIE , representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension.",
274
  "We provide formal granular evaluation metrics and identify areas for future research.",
275
  "We make G ENIE publicly available, 1 and hope that it will spur progress in language generation models as well as their automatic and manual evaluation."
276
  ]
277
+ },
278
+ {
279
+ "title": "Fake It Till You Make It: Learning-Compatible Performance Support",
280
+ "abstract": [
281
+ "A longstanding goal of artificial intelligence is to develop technologies that augment or assist humans.",
282
+ "Current approaches to developing agents that can assist humans focus on adapting behavior of the assistant, and do not consider the potential for assistants to support human learning.",
283
+ "We argue that in many cases, it is worthwhile to provide assistance in a manner that also promotes task learning or skill maintenance.",
284
+ "We term such assistance Learning-Compatible Performance Support, and present the Stochastic Q Bumpers algorithm for greatly improving learning outcomes while still providing high levels of performance support.",
285
+ "We demonstrate the effectiveness of our approach in multiple domains with simulated learners, including a complex flight control task."
286
+ ]
287
+ },
288
+ {
289
+ "title": "Sprout: Crowd-Powered Task Design for Crowdsourcing",
290
+ "abstract": [
291
+ "While crowdsourcing enables data collection at scale, ensuring high-quality data remains a challenge.",
292
+ "In particular, effective task design underlies nearly every reported crowdsourcing success, yet remains difficult to accomplish.",
293
+ "Task design is hard because it involves a costly iterative process: identifying the kind of work output one wants, conveying this information to workers, observing worker performance, understanding what remains ambiguous, revising the instructions, and repeating the process until the resulting output is satisfactory.",
294
+ "To facilitate this process, we propose a novel meta-workflow that helps requesters optimize crowdsourcing task designs and Sprout, our open-source tool, which implements this workflow.",
295
+ "Sprout improves task designs by (1) eliciting points of confusion from crowd workers, (2) enabling requesters to quickly understand these misconceptions and the overall space of questions, and (3) guiding requesters to improve the task design in response.",
296
+ "We report the results of a user study with two labeling tasks demonstrating that requesters strongly prefer Sprout and produce higher-rated instructions compared to current best practices for creating gated instructions (instructions plus a workflow for training and testing workers).",
297
+ "We also offer a set of design recommendations for future tools that support crowdsourcing task design."
298
+ ]
299
+ },
300
+ {
301
+ "title": "Self-Improving Crowdsourcing: Near-Effortless Design of Adaptive Distributed Work",
302
+ "abstract": [
303
+ "Self-Improving Crowdsourcing: Near-Effortless Design of Adaptive Distributed Work"
304
+ ]
305
+ },
306
+ {
307
+ "title": "Subcontracting Microwork",
308
+ "abstract": [
309
+ "Mainstream crowdwork platforms treat microtasks as indivisible units; however, in this article, we propose that there is value in re-examining this assumption.",
310
+ "We argue that crowdwork platforms can improve their value proposition for all stakeholders by supporting subcontracting within microtasks.",
311
+ "After describing the value proposition of subcontracting, we then define three models for microtask subcontracting: real-time assistance, task management, and task improvement, and reflect on potential use cases and implementation considerations associated with each.",
312
+ "Finally, we describe the outcome of two tasks on Mechanical Turk meant to simulate aspects of subcontracting.",
313
+ "We reflect on the implications of these findings for the design of future crowd work platforms that effectively harness the potential of subcontracting workflows."
314
+ ]
315
+ },
316
+ {
317
+ "title": "Worker-Owned Cooperative Models for Training Artificial Intelligence",
318
+ "abstract": [
319
+ "Artificial intelligence (AI) is widely expected to reduce the need for human labor in a variety of sectors.",
320
+ "Workers on virtual labor marketplaces accelerate this process by generating training data for AI systems.",
321
+ "We propose a new model where workers earn ownership of trained AI systems, allowing them to draw a long-term royalty from a tool that replaces their labor.",
322
+ "This concept offers benefits for workers and requesters alike, reducing the upfront costs of model training while increasing longer-term rewards to workers.",
323
+ "We identify design and technical problems associated with this new concept, including finding market opportunities for trained models, financing model training, and compensating workers fairly for training contributions.",
324
+ "A survey of workers on Amazon Mechanical Turk about this idea finds that workers are willing to give up 25% of their earnings in exchange for an investment in the future performance of a machine learning system."
325
+ ]
326
+ },
327
+ {
328
+ "title": "Optimal Testing for Crowd Workers",
329
+ "abstract": [
330
+ "Requesters on crowdsourcing platforms, such as Amazon Mechanical Turk, routinely insert gold questions to verify that a worker is diligent and is providing high-quality answers.",
331
+ "However, there is no clear understanding of when and how many gold questions to insert.",
332
+ "Typically, requesters mix a flat 10-30% of gold questions into the task stream of every worker.",
333
+ "This static policy is arbitrary and wastes valuable budget --- the exact percentage is often chosen with little experimentation, and, more importantly, it does not adapt to individual workers, the current mixture of spamming vs. diligent workers, or the number of tasks workers perform before quitting.",
334
+ "\n \nWe formulate the problem of balancing between (1) testing workers to determine their accuracy and (2) actually getting work performed as a partially-observable Markov decision process (POMDP) and apply reinforcement learning to dynamically calculate the best policy.",
335
+ "Evaluations on both synthetic data and with real Mechanical Turk workers show that our agent learns adaptive testing policies that produce up to 111% more reward than the non-adaptive policies used by most requesters.",
336
+ "Furthermore, our method is fully automated, easy to apply, and runs mostly out of the box."
337
+ ]
338
+ },
339
+ {
340
+ "title": "MicroTalk: Using Argumentation to Improve Crowdsourcing Accuracy",
341
+ "abstract": [
342
+ "\n \n Crowd workers are human and thus sometimes make mistakes.",
343
+ "In order to ensure the highest quality output, requesters often issue redundant jobs with gold test questions and sophisticated aggregation mechanisms based on expectation maximization (EM).",
344
+ "While these methods yield accurate results in many cases, they fail on extremely difficult problems with local minima, such as situations where the majority of workers get the answer wrong.",
345
+ "Indeed, this has caused some researchers to conclude that on some tasks crowdsourcing can never achieve high accuracies, no matter how many workers are involved.",
346
+ "This paper presents a new quality-control workflow, called MicroTalk, that requires some workers to Justify their reasoning and asks others to Reconsider their decisions after reading counter-arguments from workers with opposing views.",
347
+ "Experiments on a challenging NLP annotation task with workers from Amazon Mechanical Turk show that (1) argumentation improves the accuracy of individual workers by 20%, (2) restricting consideration to workers with complex explanations improves accuracy even more, and (3) our complete MicroTalk aggregation workflow produces much higher accuracy than simpler voting approaches for a range of budgets.",
348
+ "\n \n"
349
+ ]
350
+ },
351
+ {
352
+ "title": "Toward Automatic Bootstrapping of Online Communities Using Decision-theoretic Optimization",
353
+ "abstract": [
354
+ "Successful online communities (e.g., Wikipedia, Yelp, and StackOverflow) can produce valuable content.",
355
+ "However, many communities fail in their initial stages.",
356
+ "Starting an online community is challenging because there is not enough content to attract a critical mass of active members.",
357
+ "This paper examines methods for addressing this cold-start problem in datamining-bootstrappable communities by attracting non-members to contribute to the community.",
358
+ "We make four contributions: 1) we characterize a set of communities that are \u201cdatamining-bootstrappable\u201d and define the bootstrapping problem in terms of decision-theoretic optimization, 2) we estimate the model parameters in a case study involving the Open AI Resources website, 3) we demonstrate that non-members' predicted interest levels and request design are important features that can significantly affect the contribution rate, and 4) we ran a simulation experiment using data generated with the learned parameters and show that our decision-theoretic optimization algorithm can generate as much community utility when bootstrapping the community as our strongest baseline while issuing only 55% as many contribution requests."
359
+ ]
360
+ },
361
+ {
362
+ "title": "Effective Crowd Annotation for Relation Extraction",
363
+ "abstract": [
364
+ "Can crowdsourced annotation of training data boost performance for relation extraction over methods based solely on distant supervision?",
365
+ "While crowdsourcing has been shown effective for many NLP tasks, previous researchers found only minimal improvement when applying the method to relation extraction.",
366
+ "This paper demonstrates that a much larger boost is possible, e.g., raising F1 from 0.40 to 0.60.",
367
+ "Furthermore, the gains are due to a simple, generalizable technique, Gated Instruction , which combines an interactive tutorial, feedback to correct errors during training, and improved screening."
368
+ ]
369
+ },
370
+ {
371
+ "title": "Learning on the Job: Optimal Instruction for Crowdsourcing",
372
+ "abstract": [
373
+ "A large body of crowdsourcing research focuses on using techniques from artificial intelligence to improve estimates of latent answers to questions, assuming fixed (latent) worker quality.",
374
+ "Recently, researchers have begun to investigate how best to actively improve worker quality through instruction (Basu & Christensen, 2013; Singla et al., 2014).",
375
+ "However, none of the existing work considers the fundamental tradeoff between providing instruction and getting actual work done.",
376
+ "In this work, we present a reinforcement learning agent capable of optimizing the instruction it provides, by learning the effectiveness of its teaching actions, the quality of the worker population, and the amount of work output it can expect from individual workers.",
377
+ "Evaluations on synthetic data show that our agent learns adaptive instruction policies that significantly outperform common baseline strategies such as providing a tutorial of fixed length."
378
+ ]
379
+ },
380
+ {
381
+ "title": "Parallel Task Routing for Crowdsourcing",
382
+ "abstract": [
383
+ "\n \n An ideal crowdsourcing or citizen-science system would route tasks to the most appropriate workers, but the best assignment is unclear because workers have varying skill, tasks have varying difficulty, and assigning several workers to a single task may significantly improve output quality.",
384
+ "This paper defines a space of task routing problems, proves that even the simplest is NP-hard, and develops several approximation algorithms for parallel routing problems.",
385
+ "We show that an intuitive class of requesters' utility functions is submodular, which lets us provide iterative methods for dynamically allocating batches of tasks that make near-optimal use of available workers in each round.",
386
+ "Experiments with live oDesk workers show that our task routing algorithm uses only 48% of the human labor compared to the commonly used round-robin strategy.",
387
+ "Further, we provide versions of our task routing algorithm which enable it to scale to large numbers of workers and questions and to handle workers with variable response times while still providing significant benefit over common baselines.",
388
+ "\n \n"
389
+ ]
390
+ },
391
+ {
392
+ "title": "Artificial Intelligence and Collective Intelligence",
393
+ "abstract": [
394
+ "The vision of artificial intelligence (AI) is often manifested through an autonomous software module (agent) in a complex and uncertain environment.",
395
+ "The agent is capable of thinking ahead and acting for long periods of time in accordance with its goals/objectives.",
396
+ "It is also capable of learning and refining its understanding of the world.",
397
+ "The agent may accomplish this based on its own experience, or from the feedback provided by humans.",
398
+ "Famous recent examples include self-driving cars (Thrun 2006) and the IBM Jeopardy player Watson (Ferrucci et al. 2010).",
399
+ "This chapter explores the immense value of AI techniques for collective intelligence, including ways to make interactions between large numbers of humans more efficient.",
400
+ "By defining collective intelligence as \u201cgroups of individuals acting collectively in an intelligent manner,\u201d one soon wishes to nail down the meaning of individual.",
401
+ "In this chapter, individuals may be software agents and/or people and the collective may consist of a mixture of both.",
402
+ "The rise of collective intelligence allows novel possibilities of seamlessly integrating machine and human intelligence at a large scale \u2013 one of the holy grails of AI (known in the literature as mixed-initiative systems (Horvitz 2007)).",
403
+ "Our chapter focuses on one such integration \u2013 the use of machine intelligence for the management of crowdsourcing platforms (Weld, Mausam, and Dai 2011).",
404
+ "Crowdsourcing is a special case of collective intelligence, where a third party (called the requestor) with some internal objective solicits a group of individuals (called workers) to perform a set of inter-related tasks in service of that objective.",
405
+ "The requestor\u2019s objective may be expressed in the form of a utility function to be maximized.",
406
+ "For example, a requestor might wish to obtain labels for a large set of images; in this case, her utility function might be the average quality of labels subject to a constraint that no more than $ X dollars be spent paying workers.",
407
+ "We assume that the workers act independently, interacting only through the shared tasks.",
408
+ "Each worker has an individual utility function, which is often different from the collective\u2019s utility function.",
409
+ "Furthermore, we assume that their utility functions are independent of each other.",
410
+ "The AI subfield of multi-agent systems considers even richer models, in which individual agents may reason about the objectives of other agents, negotiate, and bargain with each other (Weiss 2013).",
411
+ "We won\u2019t discuss these techniques"
412
+ ]
413
+ },
414
+ {
415
+ "title": "Crowdsourcing Multi-Label Classification for Taxonomy Creation",
416
+ "abstract": [
417
+ "\n \n Recent work has introduced CASCADE, an algorithm for creating a globally-consistent taxonomy by crowdsourcing microwork from many individuals, each of whom may see only a tiny fraction of the data (Chilton et al. 2013).",
418
+ "While CASCADE needs only unskilled labor and produces taxonomies whose quality approaches that of human experts, it uses significantly more labor than experts.",
419
+ "This paper presents DELUGE, an improved workflow that produces taxonomies with comparable quality using significantly less crowd labor.",
420
+ "Specifically, our method for crowdsourcing multi-label classification optimizes CASCADE\u2019s most costly step (categorization) using less than 10% of the labor required by the original approach.",
421
+ "DELUGE\u2019s savings come from the use of decision theory and machine learning, which allow it to pose microtasks that aim to maximize information gain.",
422
+ "\n \n"
423
+ ]
424
+ },
425
+ {
426
+ "title": "Neo-Riemannian Cycle Detection with Weighted Finite-State Transducers",
427
+ "abstract": [
428
+ "This paper proposes a finite-state model for detecting harmonic cycles as described by neo-Riemannian theorists.",
429
+ "Given a string of triads representing a harmonic analysis of a piece, the task is to identify and label all substrings corresponding to these cycles with high accuracy.",
430
+ "The solution method uses a noisy channel model implemented with weighted finitestate transducers.",
431
+ "On a dataset of four works by Franz Schubert, our model predicted cycles in the same regions as cycles in the ground truth with a precision of 0.18 and a recall of 1.0.",
432
+ "The recalled cycles had an average edit distance of 3.2 insertions or deletions from the ground truth cycles, which average 6.4 labeled triads in length.",
433
+ "We suggest ways in which our model could be used to contribute to current work in music theory, and be generalized to other music pattern-finding applications."
434
+ ]
435
+ },
436
+ {
437
+ "title": "Mathematics and Computation in Music 2009: John Clough Memorial Conference (review)",
438
+ "abstract": [
439
+ "adjusted to the darkness, and on the somber projected video I could just make out the shine on the forehead of a young black boy with a pale, disembodied ivory child\u2019s hand placed diagonally across his breast.",
440
+ "The boy is seated beside a window with alternating hues behind its panes.",
441
+ "My ears gradually blended out the sounds from beyond the heavy velvet entry curtain.",
442
+ "Concentrating on the audio track I heard something like bongo slaps, arrhythmic, paced far from each other.",
443
+ "The illuminated \u201croom\u201d in front of me became lighter, colors behind the window changed, and a colored wall appeared.",
444
+ "Gradually, the crowd left, exposing directly opposite the filmed boy a very dimly lit, pale ivory ceramic statue of a diminutive girl extending her hand.",
445
+ "The room had by now almost emptied\u2014ah, against the wall, two chairs under a faint spotlight.",
446
+ "On the projected window, unfocused scenes take place between black-outs.",
447
+ "What was happening below the window?",
448
+ "If I sat on the installed chair, would I be accessory to it?",
449
+ "The bongo slaps became louder and faster, an occasional something reminding of a distant, filtered lion\u2019s roar instrument added.",
450
+ "Seated, I could now look around and suddenly perceive what had been there all along: directly opposite me a real window, high on the wall, with colors painted on the closed wall behind it.",
451
+ "The window, too, was spotlighted, as was I on my seat.",
452
+ "I must remain outside the window, I can enter neither the wall window in front of me nor the film window to my left.",
453
+ "To my right, the dimly lit girl.",
454
+ "I could only depend on the acoustic signals and changing colors to orient myself in the goings-on of this space.",
455
+ "Faster and more rhythmically, the drum was hit, struck, punched; very loud and almost rhythmic, on the edge of violence.",
456
+ "No, that is no longer a lion\u2019s roar, and it\u2019s so soft\u2014 that\u2019s a child\u2019s whimpering, isn\u2019t it?",
457
+ "The hitting stopped, the whimpering took on a whining animal tone.",
458
+ "Then the images behind the window expanded beyond the boundaries of the window, taking up the whole projection space of the wall.",
459
+ "Buildings in blue, exploding clouds, then the slow bongo slaps resume.",
460
+ "A man\u2019s head without eyeballs, it is quiet.",
461
+ "The projected window disappears, the boy disappears.",
462
+ "We were left in darkness.",
463
+ "With its continually changing colors, sounds, and lighting, Andro Wekua\u2019s installation By the Window (8 min 30 sec, 2008, Gladstone Gallery) uses varying media to direct my attention, my perception, and to evoke emotions.",
464
+ "Through the masterful and exigent use of them, he transports the notions evoked from the relationship between the individual sitting on the chair and the projected individual, toward inclusion of the outsider of another color standing in the room.",
465
+ "The closed window on the wall heightens consciousness of being enclosed, closed off, in the room, while through the sequence of sound and enlargement of the images a larger community and its doom were brought to mind.",
466
+ "In Mr. Wekua\u2019s piece, I initially analyzed the sounds, their nature, their potential meaning.",
467
+ "But, through the density of the combined medial input (not the density of the sound, for with the exception of the \u201capotheosis,\u201d it remained minimal), I was overwhelmed by multiple narratives, none of which was conclusive.",
468
+ "The sound and image design were reminiscent of David Lynch.",
469
+ "At times, the music narrative dominated, at times the visual; at times they were in tandem, at times the room was black or silent.",
470
+ "By conjuring its narratives without the use of spoken or graphic words, the work managed to keep the narratives in a non-verbal part of the brain, together with the emotional reactions to the non-verbal (atavistic) audio and video inputs, all of which maintained a precarious balance between their forces.",
471
+ "The over-activation of input and inconclusive narratives caused me to shudder at the end.",
472
+ "It was not a shudder of catharsis; I was shaken.",
473
+ "Not surprisingly, I noted this year a dearth of works that were joyous, playfully interactive, or exquisitely breathtaking.",
474
+ "2009 has been a year of gloom.",
475
+ "But here it hardly matters, for I have been rubbing elbows with art: with Henri Matisse, Joan Mir\u00f3, and Joan Mitchell; with Paul Klee, Yves Klein, and Gustav Klimt; and with the new acquaintances Francisco Ruiz de Infante, Hanna Schwarz, and Andro Wekua.",
476
+ "I missed Brad Pitt, but there\u2019s nothing for the music critic to complain about this year.",
477
+ "Art is finding a complex way to marry music as its equal partner, and I\u2019m happy to dance at all their weddings\u2014the squeaky, the silent, and the somber."
478
+ ]
479
  }
480
  ],
481
  "user_kps": [
 
482
  "attentive user interfaces",
 
483
  "braille documents",
484
+ "causal inferences",
485
  "citation context analysis",
486
  "citation network",
487
+ "crowdsourced knowledge",
488
+ "crowdsourced tasks",
489
+ "e-discovery",
490
  "exploratory searches",
491
  "few-shot learning",
 
 
492
  "human readability",
493
  "human readers",
494
  "information accessibility",
495
  "literature-based discovery",
496
+ "micro-task markets",
497
+ "music structure analysis",
498
+ "music theory",
499
  "natural language generation",
500
+ "slave robots",
 
501
  "textual interface"
502
  ]
503
  }
data/users/jtomczak/embeds-jtomczak-doc.npy ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0dc9459d825fbea858910d7f6d970e83f0e04e8b0215f6c0f77d178a592e88d3
3
+ size 522368
data/users/jtomczak/embeds-jtomczak-sent.pickle ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:52deffdf417eb0f6d28e83c82abfc18148f415aecaf46f4ba24301631aebbfa1
3
+ size 1720822
data/users/jtomczak/pid2idx-jtomczak-doc.json ADDED
@@ -0,0 +1 @@
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59, "60": 60, "61": 61, "62": 62, "63": 63, "64": 64, "65": 65, "66": 66, "67": 67, "68": 68, "69": 69, "70": 70, "71": 71, "72": 72, "73": 73, "74": 74, "75": 75, "76": 76, "77": 77, "78": 78, "79": 79, "80": 80, "81": 81, "82": 82, "83": 83, "84": 84}
data/users/jtomczak/seedset-jtomczak-maple.json ADDED
The diff for this file is too large to render. See raw diff
 
data/users/lsoldaini/embeds-lsoldaini-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b7a1d82769935b80ea923550fd68d2043b0a8d846ff5d09e24dd658da3940676
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8b793ccbf308d88c3d9c2a03fa61874897e3561cc3337b1a52981e538d7c3472
3
+ size 295040
data/users/lsoldaini/embeds-lsoldaini-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8a1d83007aec324a4626250026ca5c0b32feb3ad5b737d41cc673c44e714032d
3
- size 403426
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d35ba5128fcdbe6b59caf92b75f235531d32b7b6ad50cec4a33ecef4ea05fd4a
3
+ size 994342
data/users/lsoldaini/pid2idx-lsoldaini-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47}
data/users/lsoldaini/seedset-lsoldaini-maple.json CHANGED
@@ -3,16 +3,39 @@
3
  "s2_authorid": "3328733",
4
  "papers": [
5
  {
6
- "title": "One-Shot Labeling for Automatic Relevance Estimation",
7
  "abstract": [
8
- "Dealing with unjudged documents (\"holes\") in relevance assessments is a perennial problem when evaluating search systems with offline experiments.",
9
- "Holes can reduce the apparent effectiveness of retrieval systems during evaluation and introduce biases in models trained with incomplete data.",
10
- "In this work, we explore whether large language models can help us fill such holes to improve offline evaluations.",
11
- "We examine an extreme, albeit common, evaluation setting wherein only a single known relevant document per query is available for evaluation.",
12
- "We then explore various approaches for predicting the relevance of unjudged documents with respect to a query and the known relevant document, including nearest neighbor, supervised, and prompting techniques.",
13
- "We find that although the predictions of these One-Shot Labelers (1SLs) frequently disagree with human assessments, the labels they produce yield a far more reliable ranking of systems than the single labels do alone.",
14
- "Specifically, the strongest approaches can consistently reach system ranking correlations of over 0.85 with the full rankings over a variety of measures.",
15
- "Meanwhile, the approach substantially reduces the false positive rate of t-tests due to holes in relevance assessments (from 15-30% down to under 5%), giving researchers more confidence in results they find to be significant."
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
16
  ]
17
  },
18
  {
@@ -28,6 +51,91 @@
28
  "We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges."
29
  ]
30
  },
31
  {
32
  "title": "The Semantic Scholar Open Data Platform",
33
  "abstract": [
@@ -39,6 +147,42 @@
39
  "We will update this living document to reflect changes as we add new data offerings and improve existing services."
40
  ]
41
  },
42
  {
43
  "title": "Knowledge Transfer from Answer Ranking to Answer Generation",
44
  "abstract": [
@@ -58,10 +202,11 @@
58
  "Multi-document summarization (MDS) has traditionally been studied assuming a set of ground-truth topic-related input documents is provided.",
59
  "In practice, the input document set is unlikely to be available a priori and would need to be retrieved based on an information need, a setting we call open-domain MDS.",
60
  "We experiment with current state-of-the-art retrieval and summarization models on several popular MDS datasets extended to the open-domain setting.",
61
- "We find that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity retrieval errors.",
62
- "To further probe these findings, we conduct perturbation experiments on summarizer inputs to study the impact of different types of document retrieval errors.",
63
  "Based on our results, we provide practical guidelines to help facilitate a shift to open-domain MDS.",
64
- "We release our code and experimental results alongside all data or model artifacts created during our investigation."
 
65
  ]
66
  },
67
  {
@@ -74,15 +219,6 @@
74
  "Our evaluation on three AS2 and one fact verification datasets demonstrates the superiority of our pre-training technique over the traditional ones for transformers used as joint models for multi-candidate inference tasks, as well as when used as cross-encoders for sentence-pair formulations of these tasks."
75
  ]
76
  },
77
- {
78
- "title": "Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection",
79
- "abstract": [
80
- "An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents.",
81
- "In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets.",
82
- "Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document.",
83
- "Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2."
84
- ]
85
- },
86
  {
87
  "title": "Scim: Intelligent Skimming Support for Scientific Papers",
88
  "abstract": [
@@ -94,6 +230,26 @@
94
  "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
95
  ]
96
  },
97
  {
98
  "title": "Cross-Lingual G EN QA: Open-Domain Question Answering with Answer Sentence Generation",
99
  "abstract": [
@@ -126,7 +282,7 @@
126
  "While multiple ER techniques have been proposed, their practical effectiveness is still unknown because existing evaluations consider very few models and do not adequately account for overhead costs.",
127
  "We perform an extensive evaluation of ER across eight different models (17 to 900 million parameters) and fourteen tasks in English.",
128
  "We show how a simple ER technique that caches activations from an intermediate layer of a pretrained model, and learns task-specific adapters on the later layers, is broadly effective.",
129
- "For the best-performing baseline in our experiments (DeBERTa-v2 XL), adding a precomputed cache results in a>90% speedup during training and 87-91% speedup for inference, with negligible impact on accuracy.",
130
  "Our analysis reveals important areas of future work."
131
  ]
132
  },
@@ -232,28 +388,204 @@
232
  "Health experts also struggle to efficiently search the large amount of medical literature available to them, which impacts their ability of integrating the latest research findings in clinical practice.",
233
  "In this dissertation, I propose several methods to overcome these challenges, thus improving search outcomes."
234
  ]
235
  }
236
  ],
237
  "user_kps": [
238
  "attentive user interfaces",
239
- "bibliometric data",
240
- "citation network",
241
- "generating sentences",
242
- "google scholar",
243
  "human readers",
244
- "implicit relevance feedback",
245
- "information retrieval evaluation",
246
  "information retrieval models",
247
- "mobile readers",
 
 
248
  "multi-document summarization",
 
249
  "neural ranking models",
250
- "neural semantic parser",
251
  "question answering task",
252
  "question generation",
253
- "retrieval relevance",
254
- "scholarly communication",
255
  "sentence encoders",
256
- "sentence selection",
257
- "speech understanding"
 
258
  ]
259
  }
 
3
  "s2_authorid": "3328733",
4
  "papers": [
5
  {
6
+ "title": "AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters",
7
  "abstract": [
8
+ "Large language models' (LLMs) abilities are drawn from their pretraining data, and model development begins with data curation.",
9
+ "However, decisions around what data is retained or removed during this initial stage is under-scrutinized.",
10
+ "In our work, we ground web text, which is a popular pretraining data source, to its social and geographic contexts.",
11
+ "We create a new dataset of 10.3 million self-descriptions of website creators, and extract information about who they are and where they are from: their topical interests, social roles, and geographic affiliations.",
12
+ "Then, we conduct the first study investigating how ten\"quality\"and English language identification (langID) filters affect webpages that vary along these social dimensions.",
13
+ "Our experiments illuminate a range of implicit preferences in data curation: we show that some quality classifiers act like topical domain filters, and langID can overlook English content from some regions of the world.",
14
+ "Overall, we hope that our work will encourage a new line of research on pretraining data curation practices and its social implications."
15
+ ]
16
+ },
17
+ {
18
+ "title": "OLMo: Accelerating the Science of Language Models",
19
+ "abstract": [
20
+ "Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings.",
21
+ "As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed.",
22
+ "Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs.",
23
+ "To this end, this technical report details the first release of OLMo, a state-of-the-art, truly Open Language Model and its framework to build and study the science of language modeling.",
24
+ "Unlike most prior efforts that have only released model weights and inference code, we release OLMo and the whole framework, including training data and training and evaluation code.",
25
+ "We hope this release will empower and strengthen the open research community and inspire a new wave of innovation."
26
+ ]
27
+ },
28
+ {
29
+ "title": "Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research",
30
+ "abstract": [
31
+ "Language models have become a critical technology to tackling a wide range of natural language processing tasks, yet many details about how the best-performing language models were developed are not reported.",
32
+ "In particular, information about their pretraining corpora is seldom discussed: commercial language models rarely provide any information about their data; even open models rarely release datasets they are trained on, or an exact recipe to reproduce them.",
33
+ "As a result, it is challenging to conduct certain threads of language modeling research, such as understanding how training data impacts model capabilities and shapes their limitations.",
34
+ "To facilitate open research on language model pretraining, we release Dolma, a three trillion tokens English corpus, built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials.",
35
+ "In addition, we open source our data curation toolkit to enable further experimentation and reproduction of our work.",
36
+ "In this report, we document Dolma, including its design principles, details about its construction, and a summary of its contents.",
37
+ "We interleave this report with analyses and experimental results from training language models on intermediate states of Dolma to share what we have learned about important data curation practices, including the role of content or quality filters, deduplication, and multi-source mixing.",
38
+ "Dolma has been used to train OLMo, a state-of-the-art, open language model and framework designed to build and study the science of language modeling."
39
  ]
40
  },
41
  {
 
51
  "We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges."
52
  ]
53
  },
54
+ {
55
+ "title": "Paloma: A Benchmark for Evaluating Language Model Fit",
56
+ "abstract": [
57
+ "Language models (LMs) commonly report perplexity on monolithic data held out from training.",
58
+ "Implicitly or explicitly, this data is composed of domains$\\unicode{x2013}$varying distributions of language.",
59
+ "Rather than assuming perplexity on one distribution extrapolates to others, Perplexity Analysis for Language Model Assessment (Paloma), measures LM fit to 585 text domains, ranging from nytimes.com to r/depression on Reddit.",
60
+ "We invite submissions to our benchmark and organize results by comparability based on compliance with guidelines such as removal of benchmark contamination from pretraining.",
61
+ "Submissions can also record parameter and training token count to make comparisons of Pareto efficiency for performance as a function of these measures of cost.",
62
+ "We populate our benchmark with results from 6 baselines pretrained on popular corpora.",
63
+ "In case studies, we demonstrate analyses that are possible with Paloma, such as finding that pretraining without data beyond Common Crawl leads to inconsistent fit to many domains."
64
+ ]
65
+ },
66
+ {
67
+ "title": "One-Shot Labeling for Automatic Relevance Estimation",
68
+ "abstract": [
69
+ "Dealing with unjudged documents (\"holes\") in relevance assessments is a perennial problem when evaluating search systems with offline experiments.",
70
+ "Holes can reduce the apparent effectiveness of retrieval systems during evaluation and introduce biases in models trained with incomplete data.",
71
+ "In this work, we explore whether large language models can help us fill such holes to improve offline evaluations.",
72
+ "We examine an extreme, albeit common, evaluation setting wherein only a single known relevant document per query is available for evaluation.",
73
+ "We then explore various approaches for predicting the relevance of unjudged documents with respect to a query and the known relevant document, including nearest neighbor, supervised, and prompting techniques.",
74
+ "We find that although the predictions of these One-Shot Labelers (1SL) frequently disagree with human assessments, the labels they produce yield a far more reliable ranking of systems than the single labels do alone.",
75
+ "Specifically, the strongest approaches can consistently reach system ranking correlations of over 0.86 with the full rankings over a variety of measures.",
76
+ "Meanwhile, the approach substantially increases the reliability of t-tests due to filling holes in relevance assessments, giving researchers more confidence in results they find to be significant.",
77
+ "Alongside this work, we release an easy-to-use software package to enable the use of 1SL for evaluation of other ad-hoc collections or systems."
78
+ ]
79
+ },
80
+ {
81
+ "title": "The Surveillance AI Pipeline",
82
+ "abstract": [
83
+ "A rapidly growing number of voices argue that AI research, and computer vision in particular, is powering mass surveillance.",
84
+ "Yet the direct path from computer vision research to surveillance has remained obscured and difficult to assess.",
85
+ "Here, we reveal the Surveillance AI pipeline by analyzing three decades of computer vision research papers and downstream patents, more than 40,000 documents.",
86
+ "We find the large majority of annotated computer vision papers and patents self-report their technology enables extracting data about humans.",
87
+ "Moreover, the majority of these technologies specifically enable extracting data about human bodies and body parts.",
88
+ "We present both quantitative and rich qualitative analysis illuminating these practices of human data extraction.",
89
+ "Studying the roots of this pipeline, we find that institutions that prolifically produce computer vision research, namely elite universities and\"big tech\"corporations, are subsequently cited in thousands of surveillance patents.",
90
+ "Further, we find consistent evidence against the narrative that only these few rogue entities are contributing to surveillance.",
91
+ "Rather, we expose the fieldwide norm that when an institution, nation, or subfield authors computer vision papers with downstream patents, the majority of these papers are used in surveillance patents.",
92
+ "In total, we find the number of papers with downstream surveillance patents increased more than five-fold between the 1990s and the 2010s, with computer vision research now having been used in more than 11,000 surveillance patents.",
93
+ "Finally, in addition to the high levels of surveillance we find documented in computer vision papers and patents, we unearth pervasive patterns of documents using language that obfuscates the extent of surveillance.",
94
+ "Our analysis reveals the pipeline by which computer vision research has powered the ongoing expansion of surveillance."
95
+ ]
96
+ },
97
+ {
98
+ "title": "A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents",
99
+ "abstract": [
100
+ "Many real-world applications (e.g., note taking, search) require extracting a sentence or paragraph from a document and showing that snippet to a human outside of the source document.",
101
+ "Yet, users may find snippets difficult to understand as they lack context from the original document.",
102
+ "In this work, we use language models to rewrite snippets from scientific documents to be read on their own.",
103
+ "First, we define the requirements and challenges for this user-facing decontextualization task, such as clarifying where edits occur and handling references to other documents.",
104
+ "Second, we propose a framework that decomposes the task into three stages: question generation, question answering, and rewriting.",
105
+ "Using this framework, we collect gold decontextualizations from experienced scientific article readers.",
106
+ "We then conduct a range of experiments across state-of-the-art commercial and open-source language models to identify how to best provide missing-but-relevant information to models for our task.",
107
+ "Finally, we develop QaDecontext, a simple prompting strategy inspired by our framework that improves over end-to-end prompting.",
108
+ "We conclude with analysis that finds, while rewriting is easy, question generation and answering remain challenging for today's models."
109
+ ]
110
+ },
111
+ {
112
+ "title": "Bound by the Bounty: Collaboratively Shaping Evaluation Processes for Queer AI Harms",
113
+ "abstract": [
114
+ "Bias evaluation benchmarks and dataset and model documentation have emerged as central processes for assessing the biases and harms of artificial intelligence (AI) systems.",
115
+ "However, these auditing processes have been criticized for their failure to integrate the knowledge of marginalized communities and consider the power dynamics between auditors and the communities.",
116
+ "Consequently, modes of bias evaluation have been proposed that engage impacted communities in identifying and assessing the harms of AI systems (e.g., bias bounties).",
117
+ "Even so, asking what marginalized communities want from such auditing processes has been neglected.",
118
+ "In this paper, we ask queer communities for their positions on, and desires from, auditing processes.",
119
+ "To this end, we organized a participatory workshop to critique and redesign bias bounties from queer perspectives.",
120
+ "We found that when given space, the scope of feedback from workshop participants goes far beyond what bias bounties afford, with participants questioning the ownership, incentives, and efficacy of bounties.",
121
+ "We conclude by advocating for community ownership of bounties and complementing bounties with participatory processes (e.g., co-creation)."
122
+ ]
123
+ },
124
+ {
125
+ "title": "Queer In AI: A Case Study in Community-Led Participatory AI",
126
+ "abstract": [
127
+ "Queerness and queer people face an uncertain future in the face of ever more widely deployed and invasive artificial intelligence (AI).",
128
+ "These technologies have caused numerous harms to queer people, including privacy violations, censoring and downranking queer content, exposing queer people and spaces to harassment by making them hypervisible, deadnaming and outing queer people.",
129
+ "More broadly, they have violated core tenets of queerness by classifying and controlling queer identities.",
130
+ "In response to this, the queer community in AI has organized Queer in AI, a global, decentralized, volunteer-run grassroots organization that employs intersectional and community-led participatory design to build an inclusive and equitable AI future.",
131
+ "In this paper, we present Queer in AI as a case study for community-led participatory design in AI.",
132
+ "We examine how participatory design and intersectional tenets started and shaped this community\u2019s programs over the years.",
133
+ "We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess the organization\u2019s impact.",
134
+ "Queer in AI provides important lessons and insights for practitioners and theorists of participatory methods broadly through its rejection of hierarchy in favor of decentralization, success at building aid and programs by and for the queer community, and effort to change actors and institutions outside of the queer community.",
135
+ "Finally, we theorize how communities like Queer in AI contribute to the participatory design in AI more broadly by fostering cultures of participation in AI, welcoming and empowering marginalized participants, critiquing poor or exploitative participatory practices, and bringing participation to institutions outside of individual research projects.",
136
+ "Queer in AI\u2019s work serves as a case study of grassroots activism and participatory methods within AI, demonstrating the potential of community-led participatory methods and intersectional praxis, while also providing challenges, case studies, and nuanced insights to researchers developing and using participatory methods."
137
+ ]
138
+ },
139
  {
140
  "title": "The Semantic Scholar Open Data Platform",
141
  "abstract": [
 
147
  "We will update this living document to reflect changes as we add new data offerings and improve existing services."
148
  ]
149
  },
150
+ {
151
+ "title": "When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets",
152
+ "abstract": [
153
+ "Using large language models (LMs) for query or document expansion can improve generalization in information retrieval.",
154
+ "However, it is unknown whether these techniques are universally beneficial or only effective in specific settings, such as for particular retrieval models, dataset domains, or query types.",
155
+ "To answer this, we conduct the first comprehensive analysis of LM-based expansion.",
156
+ "We find that there exists a strong negative correlation between retriever performance and gains from expansion: expansion improves scores for weaker models, but generally harms stronger models.",
157
+ "We show this trend holds across a set of eleven expansion techniques, twelve datasets with diverse distribution shifts, and twenty-four retrieval models.",
158
+ "Through qualitative error analysis, we hypothesize that although expansions provide extra information (potentially improving recall), they add additional noise that makes it difficult to discern between the top relevant documents (thus introducing false positives).",
159
+ "Our results suggest the following recipe: use expansions for weaker models or when the target dataset significantly differs from training corpus in format; otherwise, avoid expansions to keep the relevance signal clear."
160
+ ]
161
+ },
162
+ {
163
+ "title": "What's In My Big Data?",
164
+ "abstract": [
165
+ "Large text corpora are the backbone of language models.",
166
+ "However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion of evaluation data (contamination).",
167
+ "In this work, we propose What's In My Big Data? (WIMBD), a platform and a set of sixteen analyses that allow us to reveal and compare the contents of large text corpora.",
169
+ "WIMBD builds on two basic capabilities -- count and search -- at scale, which allows us to analyze more than 35 terabytes on a standard compute node.",
170
+ "We apply WIMBD to ten different corpora used to train popular language models, including C4, The Pile, and RedPajama.",
171
+ "Our analysis uncovers several surprising and previously undocumented findings about these corpora, including the high prevalence of duplicate, synthetic, and low-quality content, personally identifiable information, toxic language, and benchmark contamination.",
172
+ "For instance, we find that about 50% of the documents in RedPajama and LAION-2B-en are duplicates.",
173
+ "In addition, several datasets used for benchmarking models trained on such corpora are contaminated with respect to important benchmarks, including the Winograd Schema Challenge and parts of GLUE and SuperGLUE.",
174
+ "We open-source WIMBD's code and artifacts to provide a standard set of evaluations for new text-based corpora and to encourage more analyses and transparency around them: github.com/allenai/wimbd."
175
+ ]
176
+ },
177
+ {
178
+ "title": "Overview of the TREC 2022 NeuCLIR Track",
179
+ "abstract": [
180
+ "This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.",
181
+ "The main task in this year's track was ad hoc ranked retrieval of Chinese, Persian, or Russian newswire documents using queries expressed in English.",
182
+ "Topics were developed using standard TREC processes, except that topics developed by an annotator for one language were assessed by a different annotator when evaluating that topic on a different language.",
183
+ "There were 172 total runs submitted by twelve teams."
184
+ ]
185
+ },
186
  {
187
  "title": "Knowledge Transfer from Answer Ranking to Answer Generation",
188
  "abstract": [
 
202
  "Multi-document summarization (MDS) has traditionally been studied assuming a set of ground-truth topic-related input documents is provided.",
203
  "In practice, the input document set is unlikely to be available a priori and would need to be retrieved based on an information need, a setting we call open-domain MDS.",
204
  "We experiment with current state-of-the-art retrieval and summarization models on several popular MDS datasets extended to the open-domain setting.",
205
+ "We find that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity to retrieval errors.",
206
+ "To further probe these findings, we conduct perturbation experiments on summarizer inputs to study the impact of different types of document retrieval errors.",
207
  "Based on our results, we provide practical guidelines to help facilitate a shift to open-domain MDS.",
208
+ "We release our code and experimental results alongside all data or model artifacts created during our investigation."
209
210
  ]
211
  },
212
  {
 
219
  "Our evaluation on three AS2 and one fact verification datasets demonstrates the superiority of our pre-training technique over the traditional ones for transformers used as joint models for multi-candidate inference tasks, as well as when used as cross-encoders for sentence-pair formulations of these tasks."
220
  ]
221
  },
 
222
  {
223
  "title": "Scim: Intelligent Skimming Support for Scientific Papers",
224
  "abstract": [
 
230
  "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
231
  ]
232
  },
233
+ {
234
+ "title": "Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval",
235
+ "abstract": [
236
+ "Multi-document summarization (MDS) assumes a set of topic-related documents are provided as input.",
237
+ "In practice, this document set is not always available; it would need to be retrieved given an information need, i.e. a question or topic statement, a setting we dub \"open-domain\" MDS.",
238
+ "We study this more challenging setting by formalizing the task and bootstrapping it using existing datasets, retrievers and summarizers.",
239
+ "Via extensive automatic and human evaluation, we determine: (1) state-of-the-art summarizers suffer large reductions in performance when applied to open-domain MDS, (2) additional training in the open-domain setting can reduce this sensitivity to imperfect retrieval, and (3) summarizers are insensitive to the retrieval of duplicate documents and the order of retrieved documents, but highly sensitive to other errors, like the retrieval of irrelevant documents.",
240
+ "Based on our results, we provide practical guidelines to enable future work on open-domain MDS, e.g. how to choose the number of retrieved documents to summarize.",
241
+ "Our results suggest that new retrieval and summarization methods and annotated resources for training and evaluation are necessary for further progress in the open-domain setting."
242
+ ]
243
+ },
244
+ {
245
+ "title": "Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection",
246
+ "abstract": [
247
+ "An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents.",
248
+ "In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets.",
249
+ "Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document.",
250
+ "Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2."
251
+ ]
252
+ },
253
  {
254
  "title": "Cross-Lingual GenQA: Open-Domain Question Answering with Answer Sentence Generation",
255
  "abstract": [
 
282
  "While multiple ER techniques have been proposed, their practical effectiveness is still unknown because existing evaluations consider very few models and do not adequately account for overhead costs.",
283
  "We perform an extensive evaluation of ER across eight different models (17 to 900 million parameters) and fourteen tasks in English.",
284
  "We show how a simple ER technique that caches activations from an intermediate layer of a pretrained model, and learns task-specific adapters on the later layers, is broadly effective.",
285
+ "For the best-performing baseline in our experiments (DeBERTa-v2 XL), adding a precomputed cache results in a 90% speedup during training and 87-91% speedup for inference, with negligible impact on accuracy.",
286
  "Our analysis reveals important areas of future work."
287
  ]
288
  },
 
388
  "Health experts also struggle to efficiently search the large amount of medical literature available to them, which impacts their ability of integrating the latest research findings in clinical practice.",
389
  "In this dissertation, I propose several methods to overcome these challenges, thus improving search outcomes."
390
  ]
391
+ },
392
+ {
393
+ "title": "Relation Extraction for Protein-protein Interactions Affected by Mutations",
394
+ "abstract": [
395
+ "Precision Medicine has attracted increasing attention from biomedical research.",
396
+ "Extracting information from biomedical literature about protein-protein interactions affected by mutations is a vital step towards PM because it uncovers mechanisms leading to diseases.",
397
+ "We investigate a feature-rich supervised method to accomplish this relation extraction challenge.",
398
+ "Our approach leverages a novel combination of features, as well as two auxiliary corpora, to achieve up to a 44% improvement in F1-score over the baseline method."
399
+ ]
400
+ },
401
+ {
402
+ "title": "GU IRLAB at SemEval-2018 Task 7: Tree-LSTMs for Scientific Relation Classification",
403
+ "abstract": [
404
+ "SemEval 2018 Task 7 focuses on relation extraction and classification in scientific literature.",
405
+ "In this work, we present our tree-based LSTM network for this shared task.",
406
+ "Our approach placed 9th (of 28) for subtask 1.1 (relation classification), and 5th (of 20) for subtask 1.2 (relation classification with noisy entities).",
407
+ "We also provide an ablation study of features included as input to the network."
408
+ ]
409
+ },
410
+ {
411
+ "title": "Characterizing Question Facets for Complex Answer Retrieval",
412
+ "abstract": [
413
+ "Complex answer retrieval (CAR) is the process of retrieving answers to questions that have multifaceted or nuanced answers.",
414
+ "In this work, we present two novel approaches for CAR based on the observation that question facets can vary in utility: from structural (facets that can apply to many similar topics, such as 'History') to topical (facets that are specific to the question's topic, such as the 'Westward expansion' of the United States).",
415
+ "We first explore a way to incorporate facet utility into ranking models during query term score combination.",
416
+ "We then explore a general approach to reform the structure of ranking models to aid in learning of facet utility in the query-document term matching phase.",
417
+ "When we use our techniques with a leading neural ranker on the TREC CAR dataset, our methods yield statistically significant improvements over both an unmodified neural architecture and submitted TREC runs."
418
+ ]
419
+ },
420
+ {
421
+ "title": "SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions",
422
+ "abstract": [
423
+ "Mental health is a significant and growing public health concern.",
424
+ "As language usage can be leveraged to obtain crucial insights into mental health conditions, there is a need for large-scale, labeled, mental health-related datasets of users who have been diagnosed with one or more of such conditions.",
425
+ "In this paper, we investigate the creation of high-precision patterns to identify self-reported diagnoses of nine different mental health conditions, and obtain high-quality labeled data without the need for manual labelling.",
426
+ "We introduce the SMHD (Self-reported Mental Health Diagnoses) dataset and make it available.",
427
+ "SMHD is a novel large dataset of social media posts from users with one or multiple mental health conditions along with matched control users.",
428
+ "We examine distinctions in users\u2019 language, as measured by linguistic and psychological variables.",
429
+ "We further explore text classification methods to identify individuals with mental conditions through their language."
430
+ ]
431
+ },
432
+ {
433
+ "title": "RSDD-Time: Temporal Annotation of Self-Reported Mental Health Diagnoses",
434
+ "abstract": [
435
+ "Self-reported diagnosis statements have been widely employed in studying language related to mental health in social media.",
436
+ "However, existing research has largely ignored the temporality of mental health diagnoses.",
437
+ "In this work, we introduce RSDD-Time: a new dataset of 598 manually annotated self-reported depression diagnosis posts from Reddit that include temporal information about the diagnosis.",
438
+ "Annotations include whether a mental health condition is present and how recently the diagnosis happened.",
439
+ "Furthermore, we include exact temporal spans that relate to the date of diagnosis.",
440
+ "This information is valuable for various computational methods to examine mental health through social media because one\u2019s mental health state is not static.",
441
+ "We also test several baseline classification and extraction approaches, which suggest that extracting temporal information from self-reported diagnosis statements is challenging."
442
+ ]
443
+ },
444
+ {
445
+ "title": "Helping or Hurting? Predicting Changes in Users\u2019 Risk of Self-Harm Through Online Community Interactions",
446
+ "abstract": [
447
+ "In recent years, online communities have formed around suicide and self-harm prevention.",
448
+ "While these communities offer support in moments of crisis, they can also normalize harmful behavior, discourage professional treatment, and instigate suicidal ideation.",
449
+ "In this work, we focus on how interaction with others in such a community affects the mental state of users who are seeking support.",
450
+ "We first build a dataset of conversation threads between users in a distressed state and community members offering support.",
451
+ "We then show how to construct a classifier to predict whether distressed users are helped or harmed by the interactions in the thread, and we achieve a macro-F1 score of up to 0.69."
452
+ ]
453
+ },
454
+ {
455
+ "title": "Denoising Clinical Notes for Medical Literature Retrieval with Convolutional Neural Model",
456
+ "abstract": [
457
+ "The rapid increase of medical literature poses a significant challenge for physicians, who have repeatedly reported struggling to keep up to date with developments in research.",
458
+ "This gap is one of the main challenges in integrating recent advances in clinical research with day-to-day practice.",
459
+ "Thus, the need for clinical decision support (CDS) search systems that can retrieve highly relevant medical literature given a clinical note describing a patient has emerged.",
460
+ "However, clinical notes are inherently noisy and thus are not fit to be used as queries as-is.",
461
+ "In this work, we present a convolutional neural model aimed at improving clinical notes representation, making them suitable for document retrieval.",
462
+ "The system is designed to predict, for each clinical note term, its importance in relevant documents.",
463
+ "The approach was evaluated on the 2016 TREC CDS dataset, where it achieved a 37% improvement in infNDCG over state-of-the-art query reduction methods and a 27% improvement over the best known method for the task."
464
+ ]
465
+ },
466
+ {
467
+ "title": "Learning to reformulate long queries for clinical decision support",
468
+ "abstract": [
469
+ "The large volume of biomedical literature poses a serious problem for medical professionals, who are often struggling to keep current with it.",
470
+ "At the same time, many health providers consider knowledge of the latest literature in their field a key component for successful clinical practice.",
471
+ "In this work, we introduce two systems designed to help retrieve medical literature.",
472
+ "Both receive a long, discursive clinical note as input query, and return highly relevant literature that could be used in support of clinical practice.",
473
+ "The first system is an improved version of a method previously proposed by the authors; it combines pseudo relevance feedback and a domain-specific term filter to reformulate the query.",
474
+ "The second is an approach that uses a deep neural network to reformulate a clinical note.",
475
+ "Both approaches were evaluated on the 2014 and 2015 TREC CDS datasets; in our tests, they outperform the previously proposed method by up to 28% in inferred NDCG; furthermore, they are competitive with the state of the art, achieving up to 8% improvement in inferred NDCG."
476
+ ]
477
+ },
478
+ {
479
+ "title": "Team GU-IRLAB at CLEF eHealth 2016: Task 3",
480
+ "abstract": [
481
+ "Recent surveys have shown that a growing number of internet users seek medical help online.",
482
+ "Yet, recent research [12] has shown that many commercial search engines still struggle to completely satisfy the information needs of users.",
483
+ "In this work, we present a study on the use of medical terms for query reformulation.",
484
+ "We use synonyms and hypernyms from a large medical ontology to generate alternative formulations for a query; results obtained by the reformulated queries are fused using the Borda rank aggregation algorithm."
485
+ ]
486
+ },
487
+ {
488
+ "title": "Identifying Significance of Discrepancies in Radiology Reports",
489
+ "abstract": [
490
+ "At many teaching hospitals, it is common practice for on-call radiology residents to interpret radiology examinations; such reports are later reviewed and revised by an attending physician before being used for any decision making.",
491
+ "In case there are substantial problems in the resident\u2019s initial report, the resident is called and the problems are reviewed to prevent similar future reporting errors.",
492
+ "However, due to the large volume of reports produced, attending physicians rarely discuss the problems side by side with residents, thus missing an educational opportunity.",
493
+ "In this work, we introduce a pipeline to discriminate between reports with significant discrepancies and those with non-significant discrepancies.",
494
+ "The former contain severe errors or mis-interpretations, thus representing a great learning opportunity for the resident; the latter present only minor differences (often stylistic) and play a minor role in the education of a resident.",
495
+ "By discriminating between the two, the proposed system could flag those reports that an attending radiologist should definitely review with residents under their supervision.",
496
+ "We evaluated our approach on 350 manually annotated radiology reports sampled from a collection of tens of thousands.",
497
+ "The proposed classifier achieves an Area Under the Curve (AUC) of 0.837, which represents a 14% improvement over the baselines.",
498
+ "Furthermore, the classifier reduces the False Negative Rate (FNR) by 52%, a desirable performance metric for any recall-oriented task such as the one studied here."
499
+ ]
500
+ },
501
+ {
502
+ "title": "Inferring Individual Attributes from Search Engine Queries and Auxiliary Information",
503
+ "abstract": [
504
+ "Internet data has surfaced as a primary source for investigation of different aspects of human behavior.",
505
+ "A crucial step in such studies is finding a suitable cohort (i.e., a set of users) that shares a common trait of interest to researchers.",
506
+ "However, direct identification of users sharing this trait is often impossible, as the data available to researchers is usually anonymized to preserve user privacy.",
507
+ "To facilitate research on specific topics of interest, especially in medicine, we introduce an algorithm for identifying a trait of interest in anonymous users.",
508
+ "We illustrate how a small set of labeled examples, together with statistical information about the entire population, can be aggregated to obtain labels on unseen examples.",
509
+ "We validate our approach using labeled data from the political domain.",
510
+ "We provide two applications of the proposed algorithm to the medical domain.",
511
+ "In the first, we demonstrate how to identify users whose search patterns indicate they might be suffering from certain types of cancer.",
512
+ "This shows, for the first time, that search queries can be used as a screening device for diseases that are currently often discovered too late, because no early screening tests exist.",
513
+ "In the second, we detail an algorithm to predict the distribution of diseases given their incidence in a subset of the population at study, making it possible to predict disease spread from partial epidemiological data."
514
+ ]
515
+ },
516
+ {
517
+ "title": "QuickUMLS: a fast, unsupervised approach for medical concept extraction",
518
+ "abstract": [
519
+ "Entity extraction is a fundamental step in many health informatics systems.",
520
+ "In recent years, tools such as MetaMap and cTAKES have been widely used for medical concept extraction on medical literature and clinical notes; however, relatively little interest has been placed on their scalability to large datasets.",
521
+ "In this work, we present QuickUMLS: a fast, unsupervised, approximate dictionary matching algorithm for medical concept extraction.",
522
+ "The proposed method achieves similar precision and recall of state-of-the-art systems on two clinical notes corpora, and outperforms MetaMap and cTAKES on a dataset of consumer drug reviews.",
523
+ "More importantly, it is up to 135 times faster than both systems."
524
+ ]
525
+ },
526
+ {
527
+ "title": "Towards Citation-Based Summarization of Biomedical Literature",
528
+ "abstract": [
529
+ "Citation-based summarization is a form of technical summarization that uses citations to an article to form its summary.",
530
+ "In biomedical literature, citations by themselves are not reliable enough to be used for summarization, as they fail to consider the context of the findings in the referenced article.",
531
+ "One way to remedy such a problem is to link citations to the related text spans in the reference article.",
532
+ "The ultimate goal of the TAC biomedical summarization track is to generate a citation-based summary, using both the citations and the context information.",
533
+ "This paper describes our approach for finding the context information related to each citation and determining their discourse facet (Task 1 of the track).",
534
+ "We approach this task as a search task, applying different query reformulation techniques for retrieving the relevant text spans.",
535
+ "After finding the relevant spans, we classify each citation into a set of discourse facets to capture the structure of the referenced paper.",
536
+ "While our results show a 20% improvement over the baseline, the efficiency of the system still leaves much room for improvement."
537
+ ]
538
+ },
539
+ {
540
+ "title": "Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach",
541
+ "abstract": [
542
+ "Citation sentences (citances) to a reference article have been extensively studied for summarization tasks.",
543
+ "However, citances might not accurately represent the content of the cited article, as they often fail to capture the context of the reported findings and can be affected by epistemic value drift.",
544
+ "Following the intuition behind the TAC (Text Analysis Conference) 2014 Biomedical Summarization track, we propose a system that identifies text spans in the reference article that are related to a given citance.",
545
+ "We refer to this problem as citance-reference spans matching.",
546
+ "We approach the problem as a retrieval task; in this paper, we detail a comparison of different citance reformulation methods and their combinations.",
547
+ "While our results show improvement over the baseline (up to 25.9%), their absolute magnitude implies that there is ample room for future improvement."
548
+ ]
549
+ },
550
+ {
551
+ "title": "On clinical decision support",
552
+ "abstract": [
553
+ "Recent interest in search tools for Clinical Decision Support (CDS) has dramatically increased.",
554
+ "These tools help clinicians assess a medical situation by providing actionable information in the form of a select few highly relevant recent medical papers.",
555
+ "Unlike traditional search, which is designed to deal with short queries, queries in CDS are long and narrative.",
556
+ "We investigate the utility of applying pseudo-relevance feedback (PRF), a query expansion method that performs well in keyword-based medical literature search, to CDS search.",
557
+ "Using the optimum combination of PRF parameters, we obtained a statistically significant retrieval efficiency improvement in terms of nDCG over the baseline."
558
+ ]
559
+ },
560
+ {
561
+ "title": "Query Reformulation for Clinical Decision Support Search",
562
+ "abstract": [
563
+ "One of the tasks a Clinical Decision Support (CDS) system is designed to solve is retrieving the most relevant and actionable literature for a given medical case report.",
564
+ "In this work, we present a query reformulation approach that addresses the unique formulation of case reports, making them suitable to be used on a general purpose search engine.",
565
+ "Furthermore, we introduce five reranking algorithms designed to re-order a list of retrieved literature to better match the type of information needed for each case report."
566
+ ]
567
  }
568
  ],
569
  "user_kps": [
570
  "attentive user interfaces",
571
+ "biomedical literature mining",
572
+ "citation context analysis",
573
+ "cross-language information retrieval",
574
+ "hl7 's clinical document architecture",
575
  "human readers",
 
 
576
  "information retrieval models",
577
+ "language model performance",
578
+ "language models",
579
+ "machine reading comprehension",
580
  "multi-document summarization",
581
+ "n-gram language models",
582
  "neural ranking models",
583
+ "pre-trained word vectors",
584
  "question answering task",
585
  "question generation",
 
 
586
  "sentence encoders",
587
+ "speech understanding",
588
+ "surveillance",
589
+ "user accountability"
590
  ]
591
  }
data/users/nmahyar/embeds-nmahyar-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2d259ea66ddf9a268987e29b78fe63a0daf51f02046dad00977cef55182847b0
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2bf33eb937fe52acedf3ed15fa1be55aacf692dc256c8d6184eadcdb891895ca
3
+ size 307328
data/users/nmahyar/embeds-nmahyar-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:610ecd3fa38d3b8bdb0ba9bb2895aa56b86f47432b11324746ddaee87de98d26
3
- size 455650
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c7da71b708d649eda55758feb1f310772d0bffdf0f735acc6647f7f225bdc042
3
+ size 1148020
data/users/nmahyar/pid2idx-nmahyar-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49}
data/users/nmahyar/seedset-nmahyar-maple.json CHANGED
@@ -3,14 +3,86 @@
3
  "s2_authorid": "1936892",
4
  "papers": [
5
  {
6
- "title": "Supporting Serendipitous Discovery and Balanced Analysis of Online Product Reviews with Interaction-Driven Metrics and Bias-Mitigating Suggestions",
7
  "abstract": [
8
- "In this study, we investigate how supporting serendipitous discovery and analysis of online product reviews can encourage readers to explore reviews more comprehensively prior to making purchase decisions.",
9
- "We propose two interventions \u2014 Exploration Metrics that can help readers understand and track their exploration patterns through visual indicators and a Bias Mitigation Model that intends to maximize knowledge discovery by suggesting sentiment and semantically diverse reviews.",
10
- "We designed, developed, and evaluated a text analytics system called Serendyze, where we integrated these interventions.",
11
- "We asked 100 crowd workers to use Serendyze to make purchase decisions based on product reviews.",
12
- "Our evaluation suggests that exploration metrics enabled readers to efficiently cover more reviews in a balanced way, and suggestions from the bias mitigation model influenced readers to make confident data-driven decisions.",
13
- "We discuss the role of user agency and trust in text-level analysis systems and their applicability in domains beyond review exploration."
 
  ]
15
  },
16
  {
@@ -53,21 +125,6 @@
53
  "We conclude by suggesting that these dimensions may be useful for visualization design across a variety of application domains, beyond civic text visualization."
54
  ]
55
  },
56
- {
57
- "title": "Designing With Pictographs: Envision Topics Without Sacrificing Understanding",
58
- "abstract": [
59
- "Past studies have shown that when a visualization uses pictographs to encode data, they have a positive effect on memory, engagement, and assessment of risk.",
60
- "However, little is known about how pictographs affect one\u2019s ability to understand a visualization, beyond memory for values and trends.",
61
- "We conducted two crowdsourced experiments to compare the effectiveness of using pictographs when showing part-to-whole relationships.",
62
- "In Experiment 1, we compared pictograph arrays to more traditional bar and pie charts.",
63
- "We tested participants\u2019 ability to generate high-level insights following Bloom\u2019s taxonomy of educational objectives via 6 free-response questions.",
64
- "We found that accuracy for extracting information and generating insights did not differ overall between the two versions.",
65
- "To explore the motivating differences between the designs, we conducted a second experiment where participants compared charts containing pictograph arrays to more traditional charts on 5 metrics and explained their reasoning.",
66
- "We found that some participants preferred the way that pictographs allowed them to envision the topic more easily, while others preferred traditional bar and pie charts because they seem less cluttered and faster to read.",
67
- "These results suggest that, at least in simple visualizations depicting part-to-whole relationships, the choice of using pictographs has little influence on sensemaking and insight extraction.",
68
- "When deciding whether to use pictograph arrays, designers should consider visual appeal, perceived comprehension time, ease of envisioning the topic, and clutteredness."
69
- ]
70
- },
71
  {
72
  "title": "An Interdisciplinary Perspective on Evaluation and Experimental Design for Visual Text Analytics: Position Paper",
73
  "abstract": [
@@ -92,6 +149,47 @@
92
  "Such results suggest possible new directions for text analytics in other query-oriented settings."
93
  ]
94
  },
 
95
  {
96
  "title": "CommunityPulse: Facilitating Community Input Analysis by Surfacing Hidden Insights, Reflections, and Priorities",
97
  "abstract": [
@@ -126,16 +224,18 @@
126
  ]
127
  },
128
  {
129
- "title": "Impact of the COVID-19 Pandemic on the Academic Community Results from a survey conducted at University of Massachusetts Amherst",
130
  "abstract": [
131
- "The COVID-19 pandemic has significantly impacted academic life in the United States and beyond.",
132
- "To gain a better understanding of its impact on the academic community, we conducted a large-scale survey at the University of Massachusetts Amherst.",
133
- "We collected multifaceted data from students, staff, and faculty on several aspects of their lives, such as mental and physical health, productivity, and finances.",
134
- "All our respondents expressed mental and physical issues and concerns, such as increased stress and depression levels.",
135
- "Financial difficulties seem to have the most considerable toll on staff and undergraduate students, while productivity challenges were mostly expressed by faculty and graduate students.",
136
- "As universities face many important decisions with respect to mitigating the effects of this pandemic, we present our findings with the intent of shedding light on the challenges faced by various academic groups in the face of the pandemic, calling attention to the differences between groups.",
137
- "We also contribute a discussion highlighting how the results translate to policies for the effective and timely support of the categories of respondents who need them most.",
138
- "Finally, the survey itself, which includes conditional logic allowing for personalized questions, serves as a template for further data collection, facilitating a comparison of the impact on campuses across the United States."
139
  ]
140
  },
141
  {
@@ -170,9 +270,9 @@
170
  "Therefore, novice users who lack technical expertise face hurdles in perusing and analyzing data.",
171
  "Existing tools assist in formulating queries through keyword search, query recommendation, and query auto-completion, but still require some technical expertise.",
172
  "An alternative method for accessing data is Query by Example (QBE), where users express their data exploration intent simply by providing examples of their intended data.",
173
- "We study a state-of-the-art QBE system called SQUID, and contrast it with traditional SQL querying.",
174
- "Our comparative user studies demonstrate that users with varying expertise are significantly more effective and efficient with SQUID than SQL.",
175
- "We find that SQUID eliminates the barriers in studying the database schema, formalizing task semantics, and writing syntactically correct SQL queries, and thus, substantially alleviates the need for technical expertise in data exploration."
176
  ]
177
  },
178
  {
@@ -209,7 +309,7 @@
209
  ]
210
  },
211
  {
212
- "title": "CommunityClick: Capturing and Reporting Community Feedback from Town Halls to Improve Inclusivity",
213
  "abstract": [
214
  "Local governments still depend on traditional town halls for community consultation, despite problems such as a lack of inclusive participation for attendees and difficulty for civic organizers to capture attendees' feedback in reports.",
215
  "Building on a formative study with 66 town hall attendees and 20 organizers, we designed and developed CommunityClick, a communitysourcing system that captures attendees' feedback in an inclusive manner and enables organizers to author more comprehensive reports.",
@@ -249,28 +349,303 @@
249
  "Significant findings include (1) different engagement patterns of stroke patients in game-assisted therapy, (2) imperative roles of therapists in moderating games and challenges that therapists face during game-assisted therapy, and (3) lack of support for therapists in delivering patient-centered, personalized therapy to individual stroke patients.",
250
  "Furthermore, we discuss design implications for more effective rehabilitation game therapies that take into consideration both patients and therapists and their specific needs."
251
  ]
252
  }
253
  ],
254
  "user_kps": [
255
  "bibliometric mapping",
 
256
  "community context",
257
- "computer-assisted qualitative data analysis software",
258
- "exploratory queries",
 
259
  "human-centered design",
260
  "impacted communities",
261
- "international students",
262
  "participatory design",
263
- "qualitative survey",
264
  "rehabilitation gaming system",
 
265
  "socio-emotional support",
266
  "text analytics",
267
  "text visualization",
268
- "text-based analysis",
269
- "university students",
270
  "user exploration",
271
  "user participation",
272
- "user reviews",
273
  "visual ambiguities",
 
274
  "visualization literacy"
275
  ]
276
  }
 
3
  "s2_authorid": "1936892",
4
  "papers": [
5
  {
6
+ "title": "Building Community Resiliency through Immersive Communal Extended Reality (CXR)",
7
  "abstract": [
8
+ "Situated and shared experiences can motivate community members to plan shared action, promoting community engagement.",
9
+ "We deployed and evaluated a communal extended-reality (CXR) bus tour that depicts the possible impacts of flooding and climate change.",
10
+ "This paper describes the results of seven community engagement sessions with a total of N = 74 members of the Roosevelt Island community.",
11
+ "We conducted pre- and post-bus tour focus groups to understand how the tour affected these community members\u2019 awareness and motivation to take action.",
12
+ "We found that the unique qualities of immersive, situated, and geo-located virtual reality (VR) on a bus made climate change feel real, brought the consequences of climate change closer to home, and highlighted existing community resources to address the issue.",
13
+ "Our results showed that the CXR experience helped to simulate a physical emergency state, which empowered the community to translate feelings of hopelessness into creative and actionable ideas.",
14
+ "Our finding exemplifies that geo-located VR on a bus can be a powerful tool to motivate innovations and collective action.",
15
+ "Our work is a first-of-its-kind empirical contribution showing that CXR experiences can inspire action.",
16
+ "It offers a proof-of-concept of a large-scale community engagement process featuring simulated communal experiences, leading to creative ideas for a bottom-up community resiliency plan."
17
+ ]
18
+ },
19
+ {
20
+ "title": "From Information to Choice: A Critical Inquiry Into Visualization Tools for Decision Making",
21
+ "abstract": [
22
+ "In the face of complex decisions, people often engage in a three-stage process that spans from (1) exploring and analyzing pertinent information (intelligence); (2) generating and exploring alternative options (design); and ultimately culminating in (3) selecting the optimal decision by evaluating discerning criteria (choice).",
23
+ "We can fairly assume that all good visualizations aid in the \u201cintelligence\u201d stage by enabling data exploration and analysis.",
24
+ "Yet, to what degree and how do visualization systems currently support the other decision making stages, namely \u201cdesign\u201d and \u201cchoice\u201d?",
25
+ "To further explore this question, we conducted a comprehensive review of decision-focused visualization tools by examining publications in major visualization journals and conferences, including VIS, EuroVis, and CHI, spanning all available years.",
26
+ "We employed a deductive coding method and in-depth analysis to assess whether and how visualization tools support design and choice.",
27
+ "Specifically, we examined each visualization tool by (i) its degree of visibility for displaying decision alternatives, criteria, and preferences, and (ii) its degree of flexibility for offering means to manipulate the decision alternatives, criteria, and preferences with interactions such as adding, modifying, changing mapping, and filtering.",
28
+ "Our review highlights the opportunities and challenges that decision-focused visualization tools face in realizing their full potential to support all stages of the decision making process.",
29
+ "It reveals a surprising scarcity of tools that support all stages, and while most tools excel in offering visibility for decision criteria and alternatives, the degree of flexibility to manipulate these elements is often limited, and the lack of tools that accommodate decision preferences and their elicitation is notable.",
30
+ "Based on our findings, to better support the choice stage, future research could explore enhancing flexibility levels and variety, exploring novel visualization paradigms, increasing algorithmic support, and ensuring that this automation is user-controlled via the enhanced flexibility levels.",
31
+ "Our curated list of the 88 surveyed visualization tools is available in the OSF link (https://osf.io/nrasz/?view_only=b92a90a34ae241449b5f2cd33383bfcb)."
32
+ ]
33
+ },
34
+ {
35
+ "title": "CommunityBots: Creating and Evaluating A Multi-Agent Chatbot Platform for Public Input Elicitation",
36
+ "abstract": [
37
+ "In recent years, the popularity of AI-enabled conversational agents or chatbots has risen as an alternative to traditional online surveys to elicit information from people.",
38
+ "However, there is a gap in using single-agent chatbots to converse and gather multi-faceted information across a wide variety of topics.",
39
+ "Prior works suggest that single-agent chatbots struggle to understand user intentions and interpret human language during a multi-faceted conversation.",
40
+ "In this work, we investigated how multi-agent chatbot systems can be utilized to conduct a multi-faceted conversation across multiple domains.",
41
+ "To that end, we conducted a Wizard of Oz study to investigate the design of a multi-agent chatbot for gathering public input across multiple high-level domains and their associated topics.",
42
+ "Next, we designed, developed, and evaluated CommunityBots - a multi-agent chatbot platform where each chatbot handles a different domain individually.",
43
+ "To manage conversation across multiple topics and chatbots, we proposed a novel Conversation and Topic Management (CTM) mechanism that handles topic-switching and chatbot-switching based on user responses and intentions.",
44
+ "We conducted a between-subject study comparing CommunityBots to a single-agent chatbot baseline with 96 crowd workers.",
45
+ "The results from our evaluation demonstrate that CommunityBots participants were significantly more engaged, provided higher quality responses, and experienced fewer conversation interruptions while conversing with multiple different chatbots in the same session.",
46
+ "We also found that the visual cues integrated with the interface helped the participants better understand the functionalities of the CTM mechanism, which enabled them to perceive changes in textual conversation, leading to better user satisfaction.",
47
+ "Based on the empirical insights from our study, we discuss future research avenues for multi-agent chatbot design and its application for rich information elicitation."
48
+ ]
49
+ },
50
+ {
51
+ "title": "How Data Scientists Review the Scholarly Literature",
52
+ "abstract": [
53
+ "Keeping up with the research literature plays an important role in the workflow of scientists \u2013 allowing them to understand a field, formulate the problems they focus on, and develop the solutions that they contribute, which in turn shape the nature of the discipline.",
54
+ "In this paper, we examine the literature review practices of data scientists.",
55
+ "Data science represents a field seeing an exponential rise in papers, and increasingly drawing on and being applied in numerous diverse disciplines.",
56
+ "Recent efforts have seen the development of several tools intended to help data scientists cope with a deluge of research and coordinated efforts to develop AI tools intended to uncover the research frontier.",
57
+ "Despite these trends indicative of the information overload faced by data scientists, no prior work has examined the specific practices and challenges faced by these scientists in an interdisciplinary field with evolving scholarly norms.",
58
+ "In this paper, we close this gap through a set of semi-structured interviews and think-aloud protocols of industry and academic data scientists (N = 20).",
59
+ "Our results while corroborating other knowledge workers\u2019 practices uncover several novel findings: individuals (1) are challenged in seeking and sensemaking of papers beyond their disciplinary bubbles, (2) struggle to understand papers in the face of missing details and mathematical content, (3) grapple with the deluge by leveraging the knowledge context in code, blogs, and talks, and (4) lean on their peers online and in-person.",
60
+ "Furthermore, we outline future directions likely to help data scientists cope with the burgeoning research literature."
61
+ ]
62
+ },
63
+ {
64
+ "title": "Who Do We Mean When We Talk About Visualization Novices?",
65
+ "abstract": [
66
+ "As more people rely on visualization to inform their personal and collective decisions, researchers have focused on a broader range of audiences, including \u201cnovices.\u201d",
67
+ "But successfully applying, interrogating, or advancing visualization research for novices demands a clear understanding of what \u201cnovice\u201d means in theory and practice.",
68
+ "Misinterpreting who a \u201cnovice\u201d is could lead to misapplying guidelines and overgeneralizing results.",
69
+ "In this paper, we investigated how visualization researchers define novices and how they evaluate visualizations intended for novices.",
70
+ "We analyzed 79 visualization papers that used \u201cnovice,\u201d \u201cnon-expert,\u201d \u201claypeople,\u201d or \u201cgeneral public\u201d in their titles or abstracts.",
71
+ "We found ambiguity within papers and disagreement between papers regarding what defines a novice.",
72
+ "Furthermore, we found a mismatch between the broad language describing novices and the narrow population representing them in evaluations (i.e., young people, students, and US residents).",
73
+ "We suggest directions for inclusively supporting novices in both theory and practice."
74
+ ]
75
+ },
76
+ {
77
+ "title": "Bridging the Divide: Promoting Serendipitous Discovery of Opposing Viewpoints with Visual Analytics in Social Media",
78
+ "abstract": [
79
+ "While social media promises open access to information, prior works suggest that it also plays a role as a catalyst for the social divide, which is often attributed to a shift towards algorithmic content curation based on users' digital footprints.",
80
+ "To combat this issue, methods that support serendipity have received attention in recent years that aim to provide information beyond a user's viewpoint or preference.",
81
+ "However, the utility of systems that promote serendipity in raising awareness of opposing viewpoints remains underexplored, especially in the political context.",
82
+ "To that end, we conducted a study where we asked 14 participants to explore tweets about two politically charged topics - gun control and immigration - using an interaction-driven visual analytics tool that visualizes users' exploration patterns and provides serendipitous suggestions from opposing viewpoints.",
83
+ "We found that as participants explored the tweets, they gradually became aware of opposing viewpoints and identified information they had not considered before which helped them gain knowledge about arguments from all sides.",
84
+ "We also found while people were keen to use technology that promotes serendipity to cover more topical information, they do not necessarily trust the information found on social media.",
85
+ "We hope that our work will motivate future researchers to investigate serendipitous aspects in visualizations to promote a more holistic exploration of various viewpoints."
86
  ]
87
  },
88
  {
 
125
  "We conclude by suggesting that these dimensions may be useful for visualization design across a variety of application domains, beyond civic text visualization."
126
  ]
127
  },
128
  {
129
  "title": "An Interdisciplinary Perspective on Evaluation and Experimental Design for Visual Text Analytics: Position Paper",
130
  "abstract": [
 
149
  "Such results suggest possible new directions for text analytics in other query-oriented settings."
150
  ]
151
  },
152
+ {
153
+ "title": "Supporting Serendipitous Discovery and Balanced Analysis of Online Product Reviews with Interaction-Driven Metrics and Bias-Mitigating Suggestions",
154
+ "abstract": [
155
+ "In this study, we investigate how supporting serendipitous discovery and analysis of online product reviews can encourage readers to explore reviews more comprehensively prior to making purchase decisions.",
156
+ "We propose two interventions \u2014 Exploration Metrics that can help readers understand and track their exploration patterns through visual indicators and a Bias Mitigation Model that intends to maximize knowledge discovery by suggesting sentiment and semantically diverse reviews.",
157
+ "We designed, developed, and evaluated a text analytics system called Serendyze, where we integrated these interventions.",
158
+ "We asked 100 crowd workers to use Serendyze to make purchase decisions based on product reviews.",
159
+ "Our evaluation suggests that exploration metrics enabled readers to efficiently cover more reviews in a balanced way, and suggestions from the bias mitigation model influenced readers to make confident data-driven decisions.",
160
+ "We discuss the role of user agency and trust in text-level analysis systems and their applicability in domains beyond review exploration."
161
+ ]
162
+ },
163
+ {
164
+ "title": "Supporting Serendipitous Discovery and Balanced Analysis of Unstructured Text with Interaction-Driven Metrics and Bias-Mitigating Suggestions",
165
+ "abstract": [
166
+ "In this study, we investigate how supporting serendipitous discovery and analysis of short free-form texts, such as product reviews can encourage readers to explore texts more comprehensively prior to decision-making.",
167
+ "We propose two interventions \u2014 Exploration Metrics that help readers understand and track their exploration patterns through visual indicators and a Bias Mitigation Model that maximizes knowledge discovery by suggesting readers sentiment and semantically diverse reviews.",
168
+ "We designed, developed, and evaluated a text analytics system called Serendyze, where we integrated these interventions.",
169
+ "We asked 100 crowd workers to use Serendyze to make purchase decisions based on product reviews.",
170
+ "Our evaluation suggests that exploration metrics enable readers to efficiently cover more reviews in a balanced way, and suggestions from the bias mitigation model influence readers to make confident data-driven decisions.",
171
+ "We discuss the role of user agency and trust in text-level analysis systems and their applicability in domains beyond review exploration.",
174
+ "We extend the call to future researchers to investigate these questions and devise solutions to how the dichotomy between user agency and trust in mixed-initiative systems can be balanced."
175
+ ]
176
+ },
177
+ {
178
+ "title": "From Invisible to Visible: Impacts of Metadata in Communicative Data Visualization.",
179
+ "abstract": [
180
+ "Leaving the context of visualizations invisible can have negative impacts on understanding and transparency.",
181
+ "While common wisdom suggests that recontextualizing visualizations with metadata (e.g., disclosing the data source or instructions for decoding the visualizations' encoding) may counter these effects, the impact remains largely unknown.",
182
+ "To fill this gap, we conducted two experiments.",
183
+ "In Experiment 1, we explored how chart type, topic, and user goal impacted which categories of metadata participants deemed most relevant.",
184
+ "We presented 64 participants with four real-world visualizations.",
185
+ "For each visualization, participants were given four goals and selected the type of metadata they most wanted from a set of 18 types.",
186
+ "Our results indicated that participants were most interested in metadata which explained the visualization's encoding for goals related to understanding and metadata about the source of the data for assessing trustworthiness.",
187
+ "In Experiment 2, we explored how these two types of metadata impact transparency, trustworthiness and persuasiveness, information relevance, and understanding.",
188
+ "We asked 144 participants to explain the main message of two pairs of visualizations (one with metadata and one without); rate them on scales of transparency and relevance; and then predict the likelihood that they were selected for a presentation to policymakers.",
189
+ "Our results suggested that visualizations with metadata were perceived as more thorough than those without metadata, but similarly relevant, accurate, clear, and complete.",
190
+ "Additionally, we found that metadata did not impact the accuracy of the information extracted from visualizations, but may have influenced which information participants remembered as important or interesting."
191
+ ]
192
+ },
193
  {
194
  "title": "CommunityPulse: Facilitating Community Input Analysis by Surfacing Hidden Insights, Reflections, and Priorities",
195
  "abstract": [
 
224
  ]
225
  },
226
  {
227
+ "title": "Designing With Pictographs: Envision Topics Without Sacrificing Understanding",
228
  "abstract": [
229
+ "Past studies have shown that when a visualization uses pictographs to encode data, they have a positive effect on memory, engagement, and assessment of risk.",
230
+ "However, little is known about how pictographs affect one\u2019s ability to understand a visualization, beyond memory for values and trends.",
231
+ "We conducted two crowdsourced experiments to compare the effectiveness of using pictographs when showing part-to-whole relationships.",
232
+ "In Experiment 1, we compared pictograph arrays to more traditional bar and pie charts.",
233
+ "We tested participants\u2019 ability to generate high-level insights following Bloom\u2019s taxonomy of educational objectives via 6 free-response questions.",
234
+ "We found that accuracy for extracting information and generating insights did not differ overall between the two versions.",
235
+ "To explore the motivating differences between the designs, we conducted a second experiment where participants compared charts containing pictograph arrays to more traditional charts on 5 metrics and explained their reasoning.",
236
+ "We found that some participants preferred the way that pictographs allowed them to envision the topic more easily, while others preferred traditional bar and pie charts because they seem less cluttered and faster to read.",
237
+ "These results suggest that, at least in simple visualizations depicting part-to-whole relationships, the choice of using pictographs has little influence on sensemaking and insight extraction.",
238
+ "When deciding whether to use pictograph arrays, designers should consider visual appeal, perceived comprehension time, ease of envisioning the topic, and clutteredness."
239
  ]
240
  },
241
  {
 
270
  "Therefore, novice users who lack technical expertise face hurdles in perusing and analyzing data.",
271
  "Existing tools assist in formulating queries through keyword search, query recommendation, and query auto-completion, but still require some technical expertise.",
272
  "An alternative method for accessing data is Query by Example (QBE), where users express their data exploration intent simply by providing examples of their intended data.",
273
+ "We study a state-of-the-art QBE system called SQuID, and contrast it with traditional SQL querying.",
274
+ "Our comparative user studies demonstrate that users with varying expertise are significantly more effective and efficient with SQuID than SQL.",
275
+ "We find that SQuID eliminates the barriers in studying the database schema, formalizing task semantics, and writing syntactically correct SQL queries, and thus, substantially alleviates the need for technical expertise in data exploration."
276
  ]
277
  },
278
  {
 
309
  ]
310
  },
311
  {
312
+ "title": "CommunityClick",
313
  "abstract": [
314
  "Local governments still depend on traditional town halls for community consultation, despite problems such as a lack of inclusive participation for attendees and difficulty for civic organizers to capture attendees' feedback in reports.",
315
  "Building on a formative study with 66 town hall attendees and 20 organizers, we designed and developed CommunityClick, a communitysourcing system that captures attendees' feedback in an inclusive manner and enables organizers to author more comprehensive reports.",
 
349
  "Significant findings include (1) different engagement patterns of stroke patients in game-assisted therapy, (2) imperative roles of therapists in moderating games and challenges that therapists face during game-assisted therapy, and (3) lack of support for therapists in delivering patient-centered, personalized therapy to individual stroke patients.",
350
  "Furthermore, we discuss design implications for more effective rehabilitation game therapies that take into consideration both patients and therapists and their specific needs."
351
  ]
352
+ },
353
+ {
354
+ "title": "Impact of the COVID-19 Pandemic on the Academic Community: Results from a survey conducted at University of Massachusetts Amherst",
355
+ "abstract": [
356
+ "The COVID-19 pandemic has significantly impacted academic life in the United States and beyond.",
357
+ "To gain a better understanding of its impact on the academic community, we conducted a large-scale survey at the University of Massachusetts Amherst.",
358
+ "We collected multifaceted data from students, staff, and faculty on several aspects of their lives, such as mental and physical health, productivity, and finances.",
359
+ "All our respondents expressed mental and physical issues and concerns, such as increased stress and depression levels.",
360
+ "Financial difficulties seem to have the most considerable toll on staff and undergraduate students, while productivity challenges were mostly expressed by faculty and graduate students.",
361
+ "As universities face many important decisions with respect to mitigating the effects of this pandemic, we present our findings with the intent of shedding light on the challenges faced by various academic groups in the face of the pandemic, calling attention to the differences between groups.",
362
+ "We also contribute a discussion highlighting how the results translate to policies for the effective and timely support of the categories of respondents who need them most.",
363
+ "Finally, the survey itself, which includes conditional logic allowing for personalized questions, serves as a template for further data collection, facilitating a comparison of the impact on campuses across the United States."
364
+ ]
365
+ },
366
+ {
367
+ "title": "The Civic Data Deluge: Understanding the Challenges of Analyzing Large-Scale Community Input",
368
+ "abstract": [
369
+ "Advancements in digital civics have enabled leaders to engage and gather input from a broader spectrum of the public.",
370
+ "However, less is known about the analysis process around community input and the challenges faced by civic leaders as engagement practices scale up.",
371
+ "To understand these challenges, we conducted 21 interviews with leaders on civic-oriented projects.",
372
+ "We found that at a small-scale, civic leaders manage to facilitate sensemaking through collaborative or individual approaches.",
373
+ "However, as civic leaders scale engagement practices to account for more diverse perspectives, making sense of the large quantity of qualitative data becomes a challenge.",
374
+ "Civic leaders could benefit from training in qualitative data analysis and simple, scalable collaborative analysis tools that would help the community form a shared understanding.",
375
+ "Drawing from these insights, we discuss opportunities for designing tools that could improve civic leaders' ability to utilize and reflect public input in decisions."
376
+ ]
377
+ },
378
+ {
379
+ "title": "VAST Paper Reviewers",
380
+ "abstract": [
381
+ "Presents a listing of VAST conference reviewers."
382
+ ]
383
+ },
384
+ {
385
+ "title": "Beyond the clash: investigating BIM-based building design coordination issue representation and resolution",
386
+ "abstract": [
387
+ "Successful management of the building design coordination process is critical to the efficient delivery of cost-effective and quality projects.",
388
+ "Building information modeling (BIM) has had a significant impact on design coordination, supporting the identification and management of \u2018clashes\u2019 between building systems.",
389
+ "However, many design coordination issues go beyond the traditional definition of a \u2018clash\u2019 and either go undetected or require further time, resources, and expertise to resolve.",
390
+ "The goal of this research is to better understand the causes of coordination issues and the factors that affect their resolution.",
391
+ "Specifically, we developed a taxonomy of design coordination issues and an ontology that defines the relationships between physical, process, and model-based design issues.",
392
+ "We applied the taxonomy to two case studies and analyzed the frequency of issue types, the distribution of issue types across disciplines, and the resolution rates of issue types.",
393
+ "We found that the most frequent causes of design coordination issues were design discrepancy, design error, clashes and missing items.",
394
+ "The most common design coordination issue across both case studies was design error.",
395
+ "The temporal and functional design issues took the longest time to resolve and missing information took the least amount of time.",
396
+ "Design discrepancies were least likely to be resolved by the end of design coordination.",
397
+ "The taxonomy was validated through inter-coder reliability testing.",
398
+ "The experts we interviewed confirmed that the taxonomy of coordination issues could improve design coordination processes, particularly in the issue identification stage prior to communicating the issue with the team."
399
+ ]
400
+ },
401
+ {
402
+ "title": "CommunityCrit: Inviting the Public to Improve and Evaluate Urban Design Ideas through Micro-Activities",
403
+ "abstract": [
404
+ "While urban design affects the public, most people do not have the time or expertise to participate in the process.",
405
+ "Many online tools solicit public input, yet typically limit interaction to collecting complaints or early-stage ideas.",
406
+ "This paper explores how to engage the public in more complex stages of urban design without requiring a significant time commitment.",
407
+ "After observing workshops, we designed a system called CommunityCrit that offers micro-activities to engage communities in elaborating and evaluating urban design ideas.",
408
+ "Through a four-week deployment, in partnership with a local planning group seeking to redesign a street intersection, CommunityCrit yielded 352 contributions (around 10 minutes per participant).",
409
+ "The planning group reported that CommunityCrit provided insights on public perspectives and raised awareness for their project, but noted the importance of setting expectations for the process.",
410
+ "People appreciated that the system provided a window into the planning process, empowered them to contribute, and supported diverse levels of skills and availability."
411
+ ]
412
+ },
413
+ {
414
+ "title": "Visualizing Dimension Coverage to Support Exploratory Analysis",
415
+ "abstract": [
416
+ "Data analysis involves constantly formulating and testing new hypotheses and questions about data.",
417
+ "When dealing with a new dataset, especially one with many dimensions, it can be cumbersome for the analyst to clearly remember which aspects of the data have been investigated (i.e., visually examined for patterns, trends, outliers etc.)",
418
+ "and which combinations have not.",
419
+ "Yet this information is critical to help the analyst formulate new questions that they have not already answered.",
420
+ "We observe that for tabular data, questions are typically comprised of varying combinations of data dimensions (e.g., what are the trends of Sales and Profit for different Regions?).",
421
+ "We propose representing analysis history from the angle of dimension coverage (i.e., which data dimensions have been investigated and in which combinations).",
422
+ "We use scented widgets to incorporate dimension coverage of the analysts' past work into interaction widgets of a visualization tool.",
423
+ "We demonstrate how this approach can assist analysts with the question formation process.",
424
+ "Our approach extends the concept of scented widgets to reveal aspects of one's own analysis history, and offers a different perspective on one's past work than typical visualization history tools.",
425
+ "Results of our empirical study showed that participants with access to embedded dimension coverage information relied on this information when formulating questions, asked more questions about the data, generated more top-level findings, and showed greater breadth of their analysis without sacrificing depth."
426
+ ]
427
+ },
428
+ {
429
+ "title": "ConsensUs: Visualizing Points of Disagreement for Multi-Criteria Collaborative Decision Making",
430
+ "abstract": [
431
+ "Groups often face dif\ufb01culty reaching consensus.",
432
+ "For complex decisions with multiple latent criteria, discourse alone may impede groups from pinpointing fundamental disagreements.",
433
+ "To help support a consensus building process, we introduce ConsensUs, a novel visualization tool that highlights disagreements in comparative decisions.",
434
+ "The tool facilitates groups to specify comparison criteria and to quantify their subjective opinions across these criteria.",
435
+ "Consen-sUs then highlights salient differences between members.",
436
+ "An evaluation with 87 participants shows that ConsensUs helps individuals identify points of disagreement within groups and leads people to align their scores more with the group opinion.",
437
+ "We discuss the larger design space for supporting the group consensus process, and our future directions to extend this approach to large-scale decision making platforms."
438
+ ]
439
+ },
440
+ {
441
+ "title": "Proceedings of the Sixth Workshop on Beyond Time and Errors on Novel Evaluation Methods for Visualization, BELIV 2016, Baltimore, MD, USA, October 24, 2016",
442
+ "abstract": [
443
+ "Visualization has shown its ability to produce powerful tools for analyzing, understanding, and communicating data and making it accessible for several different tasks and purposes.",
444
+ "Impact of visualization to everyday work and personal lives is demonstrated by many successes stories---such as the increasing prevalence of Tableau, the interactive visualizations produced by the New York Times, or toolkits like VTK/Paraview to name just a few.",
445
+ "A large community of casual and professional users are increasingly consuming and producing both interactive and static visualizations."
446
+ ]
447
+ },
448
+ {
449
+ "title": "A Multi-Display Environment for Community Planning",
450
+ "abstract": [
451
+ "We describe an integrated, tabletop-centered multi-display environment for engaging the public in collaborative community planning.",
452
+ "Activities are anchored in a multi-touch tabletop display augmented with multiple large-screen displays and smaller hand-held displays to foster collaborative co-creation.",
453
+ "To make the tool accessible for non-technical users, we provide familiar visualizations and intuitive interactions.",
454
+ "We discuss the latest version of our user-centered iterative design, the different roles of the three types of displays, and preliminary results of an observational study."
455
+ ]
456
+ },
457
+ {
458
+ "title": "Enabling Crowdsourced Visualizations to Support Large-Scale Civic Engagement",
459
+ "abstract": [
460
+ "Cities and local municipalities have become petri dishes for exploring new strategies for engaging the public in massive decision-making processes to address some of the highly complex and controversial challenges such as climate change [e.g.1-7].",
461
+ "These platforms are important for at least two reasons: First, they democratize access to important decision-making processes.",
462
+ "Second, they promise to improve the pace and quality of change by leveraging the immense diversity of perspectives and experiences of tens of thousands of contributors.",
463
+ "In addition, the diverse knowledge of community members can led to more innovative solutions.",
464
+ "While these civic engagement platforms typically succeed at collecting large numbers of opinions from citizens, they often offer little support for making sense of the thousands of ideas and comments contributed by the community.",
465
+ "Furthermore, they do not help the public pinpoint key disagreements among stakeholders or innovate negotiated solutions.",
466
+ "Our goal is to design a transparent sensemaking approach that not only help policymakers to make sense of thousands of contributions, but also empower citizens to understand different viewpoints, and identify point of disagreements to enable stakeholders to collaboratively co-create alternative solutions that attempt to bridge diverse viewpoints."
467
+ ]
468
+ },
469
+ {
470
+ "title": "Proceedings of the Sixth Workshop on \u201cBeyond Time and Errors:Novel Evaluation Methods for Visualization\u201d (BELIV 2016, October 24, Baltimore, Maryland, USA)",
471
+ "abstract": [
472
+ "BELIV 2016 is the sixth instance of the bi-annual BELIV workshop series, and takes place on Otober 24, 2016, as a one-day \nworkshop at IEEE VIS Conference 2016 in Baltimore, Maryland, USA.",
473
+ "This book contains the proceedings of the event."
474
+ ]
475
+ },
476
+ {
477
+ "title": "Just Scratching the Surface , the Long Road to Effective Cross-Display Interaction",
478
+ "abstract": [
479
+ "Copyright is held by the author/owner(s).",
480
+ "Presented at the Cross-Surface \u201916 workshop, in conjunction with ACM ISS\u201916.",
481
+ "November 6, Niagara Falls, Canada.",
482
+ "Abstract There are five issues that face designers of systems that support cross-surface interactions \u201cin the wild.\u201d",
483
+ "These present unique challenges to successfully deploying multiple displays that fully exploit surface technologies and the rich interactions they afford: (1) the form factor of a display often determines its appropriate role in a multi-surface environment; (2) placement rules, replication, and presentation format for content that is shared across surfaces can have complex semantics that need careful design to be effective; (3) the physical and logical topology of linked surfaces impacts how cross-surface interaction will be controlled; (4) the rapid convergence of computer graphics, computer vision, and haptic input and output are opening up vast new possibilities that were only imaginable a few years ago; and (5) the desire to make these new technologies accessible to a widely diverse set of stakeholders makes all of the issues that much more challenging.",
484
+ "We illustrate our discussion through examples drawn from our own work supporting collaborative urban design for sustainable cities."
485
+ ]
486
+ },
487
+ {
488
+ "title": "UD Co-Spaces: A Table-Centred Multi-Display Environment for Public Engagement in Urban Design Charrettes",
489
+ "abstract": [
490
+ "UD Co-Spaces (Urban Design Collaborative Spaces) is an integrated, tabletop-centered multi-display environment for engaging the public in the complex process of collaborative urban design.",
491
+ "We describe the iterative user-centered process that we followed over six years through a close interdisciplinary collaboration involving experts in urban design and neighbourhood planning.",
492
+ "Versions of UD Co-Spaces were deployed in five real-world charrettes (planning workshops) with 83 participants, a heuristic evaluation with three domain experts, and a qualitative laboratory study with 37 participants.",
493
+ "We reflect on our design decisions and how multi-display environments can engage a broad range of stakeholders in decision making and foster collaboration and co-creation within urban design.",
494
+ "We examine the parallel use of different displays, each with tailored interactive visualizations, and whether this affects what people can learn about the consequences of their choices for sustainable neighborhoods.",
495
+ "We assess UD Co-Spaces using seven principles for collaborative urban design tools that we identified based on literature in urban design, CSCW, and public engagement."
496
+ ]
497
+ },
498
+ {
499
+ "title": "Towards a Taxonomy for Evaluating User Engagement in Information Visualization",
500
+ "abstract": [
501
+ "Nowadays, with the availability of massive amounts of personal data, the role of Information Visualization is getting more and more important.",
502
+ "There are a lot of visualization and visual analytics tools to help people to make sense of their personal data in their everyday lives.",
503
+ "There is a consensus in the visualization community about the importance of user engagement for personal visual analytics.",
504
+ "However, there are many challenges associated with this topic.",
505
+ "As of yet, there is no clear and widely accepted definition for user engagement.",
506
+ "Consequently and despite some recent efforts, there are no systematic and unified methods for evaluating different aspects of user engagement.",
507
+ "In this paper, we bring attention to some fundamental open issues in the context of Information Visualization: What is the definition of engagement?",
508
+ "What are the levels of engagement?",
509
+ "How to measure engagement?",
510
+ "How to improve user engagement level?",
511
+ "This research is an initial attempt towards addressing some of these issues.",
512
+ "Our main goal is to enable visualization researchers to more accurately measure and evaluate user engagement with visualizations.",
513
+ "To this end, we reviewed definitions, measures, and frameworks from various other disciplines and specified the gap in the visualization community.",
514
+ "We propose a five level taxonomy for engagement which is deeply inspired by Boloom\u2019s Taxonomy.",
515
+ "In addition, we address some important open questions that require further exploration.",
516
+ "This paper establishes a groundwork for future research in user engagement with visualizations."
517
+ ]
518
+ },
519
+ {
520
+ "title": "Supporting Sensemaking during Collocated Collaborative Visual Analytics",
521
+ "abstract": [
522
+ "Sensemaking (i.e. the process of deriving meaning from complex information to make decisions) is often cited as an important and challenging activity for collaborative technology.",
523
+ "A key element to the success of collaborative sensemaking is effective coordination and communication within the team.",
524
+ "It requires team members to divide the task load, communicate findings and discuss the results.",
525
+ "Sensemaking is one of the human activities involved in visual analytics (i.e. the science of analytical reasoning facilitated by interactive visual interfaces).",
526
+ "The inherent complexity of the sensemaking process imposes many challenges for designers.",
527
+ "Therefore, providing effective tool support for collaborative sensemaking is a multifaceted and complex problem.",
528
+ "Such tools should provide support for visualization as well as communication and coordination.",
529
+ "Analysts need to organize their findings, hypotheses, and evidence, share that information with their collaborators, and coordinate work activities amongst members of the team.",
530
+ "Sharing externalizations (i.e. any information related to the course of analysis such as insights, hypotheses,"
531
+ ]
532
+ },
533
+ {
534
+ "title": "Supporting Communication and Coordination in Collaborative Sensemaking",
535
+ "abstract": [
536
+ "When people work together to analyze a data set, they need to organize their findings, hypotheses, and evidence, share that information with their collaborators, and coordinate activities amongst team members.",
537
+ "Sharing externalizations (recorded information such as notes) could increase awareness and assist with team communication and coordination.",
538
+ "However, we currently know little about how to provide tool support for this sort of sharing.",
539
+ "We explore how linked common work (LCW) can be employed within a `collaborative thinking space', to facilitate synchronous collaborative sensemaking activities in Visual Analytics (VA).",
540
+ "Collaborative thinking spaces provide an environment for analysts to record, organize, share and connect externalizations.",
541
+ "Our tool, CLIP, extends earlier thinking spaces by integrating LCW features that reveal relationships between collaborators' findings.",
542
+ "We conducted a user study comparing CLIP to a baseline version without LCW.",
543
+ "Results demonstrated that LCW significantly improved analytic outcomes at a collaborative intelligence task.",
544
+ "Groups using CLIP were also able to more effectively coordinate their work, and held more discussion of their findings and hypotheses.",
545
+ "LCW enabled them to maintain awareness of each other's activities and findings and link those findings to their own work, preventing disruptive oral awareness notifications."
546
+ ]
547
+ },
548
+ {
549
+ "title": "Observations of Record-Keeping in Co-located Collaborative Analysis",
550
+ "abstract": [
551
+ "Record-keeping is known to facilitate visual data analysis in single user and asynchronous collaborative settings.",
552
+ "We Implemented Co Spaces, a tool for collaborative visual data analysis with a record-keeping mechanism that enables tracking of analysis history.",
553
+ "Then we conducted an observational study with ten pairs analyzing a sales dataset, to study how collaborators use visual record-keeping during co-located work on a tabletop.",
554
+ "We report actions on visual record-keeping and inferred key user intentions for each action.",
555
+ "Actions and intentions varied depending on the analytical phase and collaboration style.",
556
+ "Based on our findings, we suggest providing various views of recorded material, showing manually saved rather than automatically saved items by default, enabling people to review collaboratorsa\u0302 work unobtrusively and automatically recommending items related to a usera\u0302s analytical task."
557
+ ]
558
+ },
559
+ {
560
+ "title": "Note-taking in co-located collaborative visual analytics: Analysis of an observational study",
561
+ "abstract": [
562
+ "In an observational study, we noticed that record-keeping plays a critical role in the overall process of collaborative visual data analysis.",
563
+ "Record-keeping involves recording material for later use, ranging from data about the visual analysis processes and visualization states to notes and annotations that externalize user insights, findings, and hypotheses.",
564
+ "In our study, co-located teams worked on collaborative visual analytics tasks using large interactive wall and tabletop displays.",
565
+ "Part of our findings is a collaborative data analysis framework that encompasses record-keeping as one of the main activities.",
566
+ "In this paper, our primary focus is on note-taking activity.",
567
+ "Based on our observations, we characterize notes according to their content, scope, and usage, and describe how they fit into a process of collaborative data analysis.",
568
+ "We then discuss suggestions to improve the design of note-taking functionality for co-located collaborative visual analytics tools."
569
+ ]
570
+ },
571
+ {
572
+ "title": "CoSpaces : Workspaces to Support Co-located Collaborative Visual Analytics",
573
+ "abstract": [
574
+ "We introduce CoSpaces, a system designed for co-located collaborative Visual Analytics on large interactive surfaces.",
575
+ "A core design idea within CoSpaces is the use of tab-based portals to access to other work areas, supporting awareness.",
576
+ "Underlying the tabs is a record-keeping mechanism that enables tracking of analysis history and note taking; such records are useful not only for managing individual analysis activities, but also for maintaining awareness of other users\u2019 activities.",
577
+ "A usability study of CoSpaces suggests that these new design ideas can effectively support group analysis tasks."
578
+ ]
579
+ },
580
+ {
581
+ "title": "Supporting note taking in co-located collaborative visual analytics on large interactive surfaces",
582
+ "abstract": [
583
+ "My research examines how to support note taking in colocated collaborative visual data analysis.",
584
+ "My preliminary observational study revealed the importance of note taking as one of the main analytical processes.",
585
+ "This finding motivated me to further investigate note taking in the context of co-located collaborative visual analytics.",
586
+ "I participated in designing and implementing CoSpaces, a tool specifically tailored for collaborative visual data analysis on tabletop displays.",
587
+ "This tool provides a framework for collaborative data analysis, in which note taking mechanisms can be studied.",
588
+ "Initially a simple note taking mechanism involving text notes recorded via an on-screen keyboard was implemented.",
589
+ "However, a usability study found this to be insufficient.",
590
+ "Because of my observation that users frequently used the automatically created links between notes and visualizations to access more information, I aim to investigate the effects of semi-automatic note taking mechanisms built into a collaborative visual analysis tool.",
591
+ "I am planning to provide analysts with editable note-templates populated with information related to the current line of inquiry.",
592
+ "I hypothesize that note-templates could improve the collaboration process by improving the structure of notes for group use.",
593
+ "Evaluation will be done through qualitative user studies.",
594
+ "Findings of this research will inform the design of future collaborative tools for visual analysis of data."
595
+ ]
596
+ },
597
+ {
598
+ "title": "On Two Desiderata for Creativity Support Tools",
599
+ "abstract": [
600
+ "This paper discusses two important desiderata for developing creativity support tools, namely ideation and empowerment.",
601
+ "We then use them to guide us in designing a new individual creativity support tool codenamed Creative-Pad.",
602
+ "Creative-Pad is designed to assist individual advertising creative to develop creative ideas for advertisements.",
603
+ "For ideation, Creative-Pad searches and filters information automatically from the internet to present to the user with related words and exemplar sentences.",
604
+ "For empowerment, CreativePad is designed in such a way that the user is neither distracted nor burdened to do any other tasks unrelated to conjuring up a creative idea for a new advertisement.",
605
+ "Creative-Pad is fully implemented and some preliminary results of its use by advertising creatives are reported."
606
+ ]
607
+ },
608
+ {
609
+ "title": "A closer look at note taking in the co-located collaborative visual analytics process",
610
+ "abstract": [
611
+ "This paper highlights the important role that record-keeping (i.e. taking notes and saving charts) plays in collaborative data analysis within the business domain.",
612
+ "The discussion of record-keeping is based on observations from a user study in which co-located teams worked on collaborative visual analytics tasks using large interactive wall and tabletop displays.",
613
+ "Part of our findings is a collaborative data analysis framework that encompasses note taking as one of the main activities.",
614
+ "We observed that record-keeping was a critical activity within the analysis process.",
615
+ "Based on our observations, we characterize notes according to their content, scope, and usage, and describe how they fit into a process of collaborative data analysis.",
616
+ "We then discuss suggestions for the design of collaborative visual analytics tools."
617
+ ]
618
+ },
619
+ {
620
+ "title": "History Tools for Collaborative Visualization",
621
+ "abstract": [
622
+ "In the context of collaborative data visualization and analysis, history tools can play an important role.",
623
+ "We present a compilation that characterizes users\u02bc probable objectives when using history tools for collaborative work, as well as operations commonly performed on histories.",
624
+ "We further characterize user objectives according to the likely time/space setting in which they would be used, and whether they are likely to be used by individuals, groups, or both.",
625
+ "We conclude by compiling a list of design and implementation challenges and research questions that need to be discussed and investigated in order to make history tools adequately support collaborative visualization activities."
626
+ ]
627
  }
628
  ],
629
  "user_kps": [
630
  "bibliometric mapping",
631
+ "collaborative visualization",
632
  "community context",
633
+ "community leaders",
634
+ "conversational agents",
635
+ "exploratory visualization",
636
  "human-centered design",
637
  "impacted communities",
638
+ "interactive displays",
639
  "participatory design",
 
640
  "rehabilitation gaming system",
641
+ "social visualizations",
642
  "socio-emotional support",
643
  "text analytics",
644
  "text visualization",
 
 
645
  "user exploration",
646
  "user participation",
 
647
  "visual ambiguities",
648
+ "visual rhetoric",
649
  "visualization literacy"
650
  ]
651
  }
data/users/nshah/embeds-nshah-doc.npy CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:04ab5259aeafadd1954db9f0d5316854792f7ef3f11f6ad1f65ff2d1e6c2d3df
3
- size 123008
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ed4bb5d94e3bf2ef940c31a33ab338786620f95e7d96ab646e3498e2f14cfc0e
3
+ size 743552
data/users/nshah/embeds-nshah-sent.pickle CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:96b623d691c8703a195f50106a711d0dd1ddb38971de3cc723e2077cbdb95419
3
- size 529378
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d39fc4fe925a81ea96acbb1dacf212ff9b908fad183990fd3e3ea9a23bcc845b
3
+ size 2880694
data/users/nshah/pid2idx-nshah-doc.json CHANGED
@@ -1 +1 @@
1
- {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19}
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40, "41": 41, "42": 42, "43": 43, "44": 44, "45": 45, "46": 46, "47": 47, "48": 48, "49": 49, "50": 50, "51": 51, "52": 52, "53": 53, "54": 54, "55": 55, "56": 56, "57": 57, "58": 58, "59": 59, "60": 60, "61": 61, "62": 62, "63": 63, "64": 64, "65": 65, "66": 66, "67": 67, "68": 68, "69": 69, "70": 70, "71": 71, "72": 72, "73": 73, "74": 74, "75": 75, "76": 76, "77": 77, "78": 78, "79": 79, "80": 80, "81": 81, "82": 82, "83": 83, "84": 84, "85": 85, "86": 86, "87": 87, "88": 88, "89": 89, "90": 90, "91": 91, "92": 92, "93": 93, "94": 94, "95": 95, "96": 96, "97": 97, "98": 98, "99": 99, "100": 100, "101": 101, "102": 102, "103": 103, "104": 104, "105": 105, "106": 106, "107": 107, "108": 108, "109": 109, "110": 110, "111": 111, "112": 112, "113": 113, "114": 114, "115": 115, "116": 116, "117": 117, "118": 118, "119": 119, "120": 120}
data/users/nshah/seedset-nshah-maple.json CHANGED
The diff for this file is too large to render. See raw diff
 
data/users/smysore/embeds-smysore-doc.npy ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e9c91d73ca1ea06211adabfd5b2d14806a63f81e20d3eac20de7a0d4905b787
3
+ size 104576
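The three-line `version`/`oid`/`size` stanzas above are Git LFS pointer files: the repository tracks only this pointer, while the actual `.npy`/`.pickle` blob lives in LFS storage, addressed by its sha256. A minimal parser sketch (illustrative only, not part of this repo):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents copied from embeds-smysore-doc.npy above.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0e9c91d73ca1ea06211adabfd5b2d14806a63f81e20d3eac20de7a0d4905b787\n"
    "size 104576"
)

info = parse_lfs_pointer(pointer)
# info["oid"] carries the content hash; info["size"] is the blob size in bytes.
```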
data/users/smysore/embeds-smysore-sent.pickle ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9017a37368cbdf08171138f15fbab79bc32ba7df71034e8873eb8277082d9a5c
3
+ size 360301
data/users/smysore/pid2idx-smysore-doc.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16}
data/users/smysore/seedset-smysore-maple.json ADDED
@@ -0,0 +1,230 @@
1
+ {
2
+ "username": "smysore",
3
+ "s2_authorid": "9076501",
4
+ "papers": [
5
+ {
6
+ "title": "Multi-Modal Augmentation for Large Language Models with Applications to Task-Oriented Dialogues",
7
+ "abstract": [
8
+ "We introduce MarunaBot V2, an advanced Task-Oriented Dialogue System (TODS) primarily aimed at aiding users in cooking and Do-It-Yourself tasks.",
9
+ "We utilized large language models (LLMs) for data generation and inference, and implemented hybrid methods for intent classification, retrieval, and question answering, striking a balance between efficiency and performance.",
10
+ "A key feature of our system is its multi-modal capabilities.",
11
+ "We have incorporated a multi-modal enrichment technique that uses a fine-tuned CLIP model to supplement recipe instructions with pertinent images, a custom Diffusion model for image enhancement and generation, and a method for multi-modal option matching.",
12
+ "A unique aspect of our system is its user-centric development approach, facilitated by a custom tool for tracking user interactions and swiftly integrating feedback.",
13
+ "For a demonstration of our system, visit https://youtu.be/4MNI-puv_eE ."
14
+ ]
15
+ },
16
+ {
17
+ "title": "MarunaBot V2: Towards End-to-End Multi-Modal Task-Oriented Dialogue Systems",
18
+ "abstract": [
19
+ "We introduce MarunaBot V2, an advanced Task-Oriented Dialogue System (TODS) primarily aimed at aiding users in cooking and Do-It-Yourself tasks.",
20
+ "We utilized large language models (LLMs) for data generation and inference, and implemented hybrid methods for intent classification, retrieval, and question answering, striking a balance between efficiency and performance.",
21
+ "A key feature of our system is its multi-modal capabilities.",
22
+ "We have incorporated a multi-modal enrichment technique that uses a fine-tuned CLIP model to supplement recipe instructions with pertinent images, a custom Diffusion model for image enhancement and generation, and a method for multi-modal option matching.",
23
+ "A unique aspect of our system is its user-centric development approach, facilitated by a custom tool for tracking user interactions and swiftly integrating feedback.",
24
+ "Finally, we showcase the promising results of our end-to-end retrieval-augmented LLM taskbot, MarunaChef, and set a promising precedent for future task-oriented dialogue systems."
25
+ ]
26
+ },
27
+ {
28
+ "title": "LaMP: When Large Language Models Meet Personalization",
29
+ "abstract": [
30
+ "This paper highlights the importance of personalization in large language models and introduces the LaMP benchmark -- a novel benchmark for training and evaluating language models for producing personalized outputs.",
31
+ "LaMP offers a comprehensive evaluation framework with diverse language tasks and multiple entries for each user profile.",
32
+ "It consists of seven personalized tasks, spanning three text classification and four text generation tasks.",
33
+ "We additionally propose two retrieval augmentation approaches that retrieve personal items from each user profile for personalizing language model outputs.",
34
+ "To this aim, we study various retrieval models, including term matching, semantic matching, and time-aware methods.",
35
+ "Extensive experiments on LaMP for zero-shot and fine-tuned language models demonstrate the efficacy of the proposed retrieval augmentation approach and highlight the impact of personalization in various natural language tasks."
36
+ ]
37
+ },
38
+ {
39
+ "title": "Large Language Model Augmented Narrative Driven Recommendations",
40
+ "abstract": [
41
+ "Narrative-driven recommendation (NDR) presents an information access problem where users solicit recommendations with verbose descriptions of their preferences and context, for example, travelers soliciting recommendations for points of interest while describing their likes/dislikes and travel circumstances.",
42
+ "These requests are increasingly important with the rise of natural language-based conversational interfaces for search and recommendation systems.",
43
+ "However, NDR lacks abundant training data for models, and current platforms commonly do not support these requests.",
+ "Fortunately, classical user-item interaction datasets contain rich textual data, e.g., reviews, which often describe user preferences and context \u2013 this may be used to bootstrap training for NDR models.",
+ "In this work, we explore using large language models (LLMs) for data augmentation to train NDR models.",
+ "We use LLMs for authoring synthetic narrative queries from user-item interactions with few-shot prompting and train retrieval models for NDR on synthetic queries and user-item interaction data.",
+ "Our experiments demonstrate that this is an effective strategy for training small-parameter retrieval models that outperform other retrieval and LLM baselines for narrative-driven recommendation."
+ ]
+ },
+ {
+ "title": "PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers",
+ "abstract": [
+ "Powerful large language models have facilitated the development of writing assistants that promise to significantly improve the quality and efficiency of composition and communication.",
+ "However, a barrier to effective assistance is the lack of personalization in LLM outputs to the author's communication style and specialized knowledge.",
+ "In this paper, we address this challenge by proposing PEARL, a retrieval-augmented LLM writing assistant personalized with a generation-calibrated retriever.",
+ "Our retriever is trained to select historic user-authored documents for prompt augmentation, such that they are likely to best personalize LLM generations for a user request.",
+ "We propose two key novelties for training our retriever: 1) A training data selection method that identifies user requests likely to benefit from personalization and documents that provide that benefit; and 2) A scale-calibrating KL-divergence objective that ensures that our retriever closely tracks the benefit of a document for personalized generation.",
+ "We demonstrate the effectiveness of PEARL in generating personalized workplace social media posts and Reddit comments.",
+ "Finally, we showcase the potential of a generation-calibrated retriever to double as a performance predictor and further improve low-quality generations via LLM chaining."
+ ]
+ },
+ {
+ "title": "How Data Scientists Review the Scholarly Literature",
+ "abstract": [
+ "Keeping up with the research literature plays an important role in the workflow of scientists \u2013 allowing them to understand a field, formulate the problems they focus on, and develop the solutions that they contribute, which in turn shape the nature of the discipline.",
+ "In this paper, we examine the literature review practices of data scientists.",
+ "Data science represents a field seeing an exponential rise in papers, and increasingly drawing on and being applied in numerous diverse disciplines.",
+ "Recent efforts have seen the development of several tools intended to help data scientists cope with a deluge of research and coordinated efforts to develop AI tools intended to uncover the research frontier.",
+ "Despite these trends indicative of the information overload faced by data scientists, no prior work has examined the specific practices and challenges faced by these scientists in an interdisciplinary field with evolving scholarly norms.",
+ "In this paper, we close this gap through a set of semi-structured interviews and think-aloud protocols of industry and academic data scientists (N = 20).",
+ "Our results while corroborating other knowledge workers\u2019 practices uncover several novel findings: individuals (1) are challenged in seeking and sensemaking of papers beyond their disciplinary bubbles, (2) struggle to understand papers in the face of missing details and mathematical content, (3) grapple with the deluge by leveraging the knowledge context in code, blogs, and talks, and (4) lean on their peers online and in-person.",
+ "Furthermore, we outline future directions likely to help data scientists cope with the burgeoning research literature."
+ ]
+ },
+ {
+ "title": "Editable User Profiles for Controllable Text Recommendations",
+ "abstract": [
+ "Methods for making high-quality recommendations often rely on learning latent representations from interaction data.",
+ "These methods, while performant, do not provide ready mechanisms for users to control the recommendation they receive.",
+ "Our work tackles this problem by proposing LACE, a novel concept value bottleneck model for controllable text recommendations.",
+ "LACE represents each user with a succinct set of human-readable concepts through retrieval given user-interacted documents and learns personalized representations of the concepts based on user documents.",
+ "This concept based user profile is then leveraged to make recommendations.",
+ "The design of our model affords control over the recommendations through a number of intuitive interactions with a transparent user profile.",
+ "We first establish the quality of recommendations obtained from LACE in an offline evaluation on three recommendation tasks spanning six datasets in warm-start, cold-start, and zero-shot setups.",
+ "Next, we validate the controllability of LACE under simulated user interactions.",
+ "Finally, we implement LACE in an interactive controllable recommender system and conduct a user study to demonstrate that users are able to improve the quality of recommendations they receive through interactions with an editable user profile."
+ ]
+ },
+ {
+ "title": "Augmenting Scientific Creativity with Retrieval across Knowledge Domains",
+ "abstract": [
+ "Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas.",
+ "While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas \textit{outside} such domains.",
+ "In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification.",
+ "To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains.",
+ "Furthermore, end-users can `zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters.",
+ "Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration."
+ ]
+ },
+ {
+ "title": "CSFCube - A Test Collection of Computer Science Research Articles for Faceted Query by Example",
+ "abstract": [
+ "Query by Example is a well-known information retrieval task in which a document is chosen by the user as the search query and the goal is to retrieve relevant documents from a large collection.",
+ "However, a document often covers multiple aspects of a topic.",
+ "To address this scenario we introduce the task of faceted Query by Example in which users can also specify a finer grained aspect in addition to the input query document.",
+ "We focus on the application of this task in scientific literature search.",
+ "We envision models which are able to retrieve scientific papers analogous to a query scientific paper along specifically chosen rhetorical structure elements as one solution to this problem.",
+ "In this work, the rhetorical structure elements, which we refer to as facets, indicate objectives, methods, or results of a scientific paper.",
+ "We introduce and describe an expert annotated test collection to evaluate models trained to perform this task.",
+ "Our test collection consists of a diverse set of 50 query documents in English, drawn from computational linguistics and machine learning venues.",
+ "We carefully follow the annotation guideline used by TREC for depth-k pooling (k = 100 or 250) and the resulting data collection consists of graded relevance scores with high annotation agreement.",
+ "State of the art models evaluated on our dataset show a significant gap to be closed in further work.",
+ "Our dataset may be accessed here: https://github.com/iesl/CSFCube"
+ ]
+ },
+ {
+ "title": "MS-Mentions: Consistently Annotating Entity Mentions in Materials Science Procedural Text",
+ "abstract": [
+ "Material science synthesis procedures are a promising domain for scientific NLP, as proper modeling of these recipes could provide insight into new ways of creating materials.",
+ "However, a fundamental challenge in building information extraction models for material science synthesis procedures is getting accurate labels for the materials, operations, and other entities of those procedures.",
+ "We present a new corpus of entity mention annotations over 595 Material Science synthesis procedural texts (157,488 tokens), which greatly expands the training data available for the Named Entity Recognition task.",
+ "We outline a new label inventory designed to provide consistent annotations and a new annotation approach intended to maximize the consistency and annotation speed of domain experts.",
+ "Inter-annotator agreement studies and baseline models trained upon the data suggest that the corpus provides high-quality annotations of these mention types.",
+ "This corpus helps lay a foundation for future high-quality modeling of synthesis procedures."
+ ]
+ },
+ {
+ "title": "Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity",
+ "abstract": [
+ "We present a new scientific document similarity model based on matching fine-grained aspects of texts.",
+ "To train our model, we exploit a naturally-occurring source of supervision: sentences in the full-text of papers that cite multiple papers together (co-citations).",
+ "Such co-citations not only reflect close paper relatedness, but also provide textual descriptions of how the co-cited papers are related.",
+ "This novel form of textual supervision is used for learning to match aspects across papers.",
+ "We develop multi-vector representations where vectors correspond to sentence-level aspects of documents, and present two methods for aspect matching: (1) A fast method that only matches single aspects, and (2) a method that makes sparse multiple matches with an Optimal Transport mechanism that computes an Earth Mover\u2019s Distance between aspects.",
+ "Our approach improves performance on document similarity tasks in four datasets.",
+ "Further, our fast single-match method achieves competitive results, paving the way for applying fine-grained similarity to large scientific corpora."
+ ]
+ },
+ {
+ "title": "An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text",
+ "abstract": [
+ "In specific domains, such as procedural scientific text, human labeled data for shallow semantic parsing is especially limited and expensive to create.",
+ "Fortunately, such specific domains often use rather formulaic writing, such that the different ways of expressing relations in a small number of grammatically similar labeled sentences may provide high coverage of semantic structures in the corpus, through an appropriately rich similarity metric.",
+ "In light of this opportunity, this paper explores an instance-based approach to the relation prediction sub-task within shallow semantic parsing, in which semantic labels from structurally similar sentences in the training set are copied to test sentences.",
+ "Candidate similar sentences are retrieved using SciBERT embeddings.",
+ "For labels where it is possible to copy from a similar sentence we employ an instance level copy network, when this is not possible, a globally shared parametric model is employed.",
+ "Experiments show our approach outperforms both baseline and prior methods by 0.75 to 3 F1 absolute in the Wet Lab Protocol Corpus and 1 F1 absolute in the Materials Science Procedural Text Corpus."
+ ]
+ },
+ {
+ "title": "The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures",
+ "abstract": [
+ "Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text.",
+ "Large-scale analysis of these synthesis procedures would facilitate deeper scientific understanding of materials synthesis and enable automated synthesis planning.",
+ "Such analysis requires extracting structured representations of synthesis procedures from the raw text as a first step.",
+ "To facilitate the training and evaluation of synthesis extraction models, we introduce a dataset of 230 synthesis procedures annotated by domain experts with labeled graphs that express the semantics of the synthesis sentences.",
+ "The nodes in this graph are synthesis operations and their typed arguments, and labeled edges specify relations between the nodes.",
+ "We describe this new resource in detail and highlight some specific challenges to annotating scientific text with shallow semantic structure.",
+ "We make the corpus available to the community to promote further research and development of scientific information extraction systems."
+ ]
+ },
+ {
+ "title": "Roll Call Vote Prediction with Knowledge Augmented Models",
+ "abstract": [
+ "The official voting records of United States congresspeople are preserved as roll call votes.",
+ "Prediction of voting behavior of politicians for whom no voting record exists, such as individuals running for office, is important for forecasting key political decisions.",
+ "Prior work has relied on past votes cast to predict future votes, and thus fails to predict voting patterns for politicians without voting records.",
+ "We address this by augmenting a prior state of the art model with multiple sources of external knowledge so as to enable prediction on unseen politicians.",
+ "The sources of knowledge we use are news text and Freebase, a manually curated knowledge base.",
+ "We propose augmentations based on unigram features for news text, and a knowledge base embedding method followed by a neural network composition for relations from Freebase.",
+ "Empirical evaluation of these approaches indicate that the proposed models outperform the prior system for politicians with complete historical voting records by 1.0% point of accuracy (8.7% error reduction) and for politicians without voting records by 33.4% points of accuracy (66.7% error reduction).",
+ "We also show that the knowledge base augmented approach outperforms the news text augmented approach by 4.2% points of accuracy."
+ ]
+ },
+ {
+ "title": "Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks",
+ "abstract": [
+ "Leveraging new data sources is a key step in accelerating the pace of materials design and discovery.",
+ "To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated, unsupervised method for connecting scientific literature to inorganic synthesis insights.",
+ "Starting from natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for any inorganic materials of interest.",
+ "We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses.",
+ "We demonstrate that the model learns representations of materials corresponding to synthesis-related properties, and that the model's behavior complements existing thermodynamic knowledge.",
+ "Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds."
+ ]
+ },
+ {
+ "title": "Automatically Extracting Action Graphs from Materials Science Synthesis Procedures",
+ "abstract": [
+ "Computational synthesis planning approaches have achieved recent success in organic chemistry, where tabulated synthesis procedures are readily available for supervised learning.",
+ "The syntheses of inorganic materials, however, exist primarily as natural language narratives contained within scientific journal articles.",
+ "This synthesis information must first be extracted from the text in order to enable analogous synthesis planning methods for inorganic materials.",
+ "In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds.",
+ "We define the structured representation as a set of linked events made up of extracted scientific entities and evaluate two unsupervised approaches for extracting these structures on expert-annotated articles: a strong heuristic baseline and a generative model of procedural text.",
+ "We also evaluate a variety of supervised models for extracting scientific entities.",
+ "Our results provide insight into the nature of the data and directions for further work in this exciting new area of research."
+ ]
+ },
+ {
+ "title": "Complex and degraded color document image binarization",
+ "abstract": [
+ "We present a document binarization scheme that is intended at consistently binarizing a range of degraded color document images.",
+ "The proposed solution makes use of a mean-shift algorithm based segmentation applied at different scales of the image and a contrast enhanced version of the popular Niblack's thresholding method.",
+ "The solution has been evaluated using standard metrics used in a prominent binarization competition and has also been subject to an end-to-end evaluation by use in an OCR system.",
+ "The proposed solution was found to perform at par or better than existing state of the art binarization solutions and was found to always be more consistent in performance than the state of the art."
+ ]
+ }
+ ],
+ "user_kps": [
+ "chemical libraries",
+ "citation context analysis",
+ "content-based recommenders",
+ "dialogue model",
+ "dialogue systems",
+ "document image binarization",
+ "electorate",
+ "exploratory searches",
+ "factored language models",
+ "inter-annotator agreement",
+ "literature-based discovery",
+ "multi-modal features",
+ "paraphrastic sentence embeddings",
+ "physical synthesis",
+ "recurrent neural network language model",
+ "scholarly productivity",
+ "semantic labeling",
+ "semantic role labeling",
+ "topic modeling",
+ "writer adaptive training"
+ ]
+ }
data/users/upaquet/embeds-upaquet-doc.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:201ca3eed3ec63a5ef90b77886c017cf5733e758413e900a9c30751234e0be31
+ size 252032
data/users/upaquet/embeds-upaquet-sent.pickle ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbee3b9b1ca7037b55facc1745a9debf2b095cb2bfd4688727e0747bf35379dd
+ size 861973
data/users/upaquet/pid2idx-upaquet-doc.json ADDED
@@ -0,0 +1 @@
+ {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9, "10": 10, "11": 11, "12": 12, "13": 13, "14": 14, "15": 15, "16": 16, "17": 17, "18": 18, "19": 19, "20": 20, "21": 21, "22": 22, "23": 23, "24": 24, "25": 25, "26": 26, "27": 27, "28": 28, "29": 29, "30": 30, "31": 31, "32": 32, "33": 33, "34": 34, "35": 35, "36": 36, "37": 37, "38": 38, "39": 39, "40": 40}
data/users/upaquet/seedset-upaquet-maple.json ADDED
@@ -0,0 +1,523 @@
1
+ {
2
+ "username": "upaquet",
3
+ "s2_authorid": "1722403",
4
+ "papers": [
5
+ {
6
+ "title": "Efficient Bayesian inference of instantaneous reproduction numbers at fine spatial scales, with an application to mapping and nowcasting the Covid\u201019 epidemic in British local authorities",
7
+ "abstract": [
8
+ "The spatio-temporal pattern of Covid-19 infections, as for most infectious disease epidemics, is highly heterogenous as a consequence of local variations in risk factors and exposures.",
9
+ "Consequently, the widely quoted national-level estimates of reproduction numbers are of limited value in guiding local interventions and monitoring their effectiveness.",
10
+ "It is crucial for national and local policy-makers, and for health protection teams, that accurate, well-calibrated and timely predictions of Covid-19 incidences and transmission rates are available at fine spatial scales.",
11
+ "Obtaining such estimates is challenging, not least due to the prevalence of asymptomatic Covid-19 transmissions, as well as difficulties of obtaining high-resolution and high-frequency data.",
12
+ "In addition, low case counts at a local level further confounds the inference for Covid-19 transmission rates, adding unwelcome uncertainty.",
13
+ "In this paper we develop a hierarchical Bayesian method for inference of transmission rates at fine spatial scales.",
14
+ "Our model incorporates both temporal and spatial dependencies of local transmission rates in order to share statistical strength and reduce uncertainty.",
15
+ "It also incorporates"
16
+ ]
17
+ },
18
+ {
19
+ "title": "Acquisition of chess knowledge in AlphaZero",
20
+ "abstract": [
21
+ "Significance Seventy years ago, Alan Turing conjectured that a chess-playing machine could be built that would self-learn and continuously profit from its own experience.",
22
+ "The AlphaZero system\u2014a neural network-powered reinforcement learner\u2014passed this milestone.",
23
+ "In this paper, we ask the following questions.",
24
+ "How did it do it?",
25
+ "What did it learn from its experience, and how did it encode it?",
26
+ "Did it learn anything like a human understanding of chess, in spite of having never seen a human game?",
27
+ "Remarkably, we find many strong correspondences between human concepts and AlphaZero\u2019s representations that emerge during training, even though none of these concepts were initially present in the network."
28
+ ]
29
+ },
30
+ {
31
+ "title": "Role of Human-AI Interaction in Selective Prediction",
32
+ "abstract": [
33
+ "Recent work has shown the potential benefit of selective prediction systems that can learn to defer to a human when the predictions of the AI are unreliable, particularly to improve the reliability of AI systems in high-stakes applications like healthcare or conservation.",
34
+ "However, most prior work assumes that human behavior remains unchanged when they solve a prediction task as part of a human-AI team as opposed to by themselves.",
35
+ "We show that this is not the case by performing experiments to quantify human-AI interaction in the context of selective prediction.",
36
+ "In particular, we study the impact of communicating different types of information to humans about the AI system's decision to defer.",
37
+ "Using real-world conservation data and a selective prediction system that improves expected accuracy over that of the human or AI system working individually, we show that this messaging has a significant impact on the accuracy of human judgements.",
38
+ "Our results study two components of the messaging strategy: 1) Whether humans are informed about the prediction of the AI system and 2) Whether they are informed about the decision of the selective prediction system to defer.",
39
+ "By manipulating these messaging components, we show that it is possible to significantly boost human performance by informing the human of the decision to defer, but not revealing the prediction of the AI.",
40
+ "We therefore show that it is vital to consider how the decision to defer is communicated to a human when designing selective prediction systems, and that the composite accuracy of a human-AI team must be carefully evaluated using a human-in-the-loop framework."
41
+ ]
42
+ },
43
+ {
44
+ "title": "Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess",
45
+ "abstract": [
46
+ "It is non-trivial to design engaging and balanced sets of game rules.",
47
+ "Modern chess has evolved over centuries, but without a similar recourse to history, the consequences of rule changes to game dynamics are difficult to predict.",
48
+ "AlphaZero provides an alternative in silico means of game balance assessment.",
49
+ "It is a system that can learn near-optimal strategies for any rule set from scratch, without any human supervision, by continually learning from its own experience.",
50
+ "In this study we use AlphaZero to creatively explore and design new chess variants.",
51
+ "There is growing interest in chess variants like Fischer Random Chess, because of classical chess's voluminous opening theory, the high percentage of draws in professional play, and the non-negligible number of games that end while both players are still in their home preparation.",
52
+ "We compare nine other variants that involve atomic changes to the rules of chess.",
53
+ "The changes allow for novel strategic and tactical patterns to emerge, while keeping the games close to the original.",
54
+ "By learning near-optimal strategies for each variant with AlphaZero, we determine what games between strong human players might look like if these variants were adopted.",
55
+ "Qualitatively, several variants are very dynamic.",
56
+ "An analytic comparison show that pieces are valued differently between variants, and that some variants are more decisive than classical chess.",
57
+ "Our findings demonstrate the rich possibilities that lie beyond the rules of modern chess."
58
+ ]
59
+ },
60
+ {
61
+ "title": "An Efficient Implementation of Riemannian Manifold Hamiltonian Monte Carlo for Gaussian Process Models",
62
+ "abstract": [
63
+ "This technical report presents pseudo-code for a Riemannian manifold Hamiltonian Monte Carlo (RMHMC) method to efficiently simulate samples from $N$-dimensional posterior distributions $p(x|y)$, where $x \\in R^N$ is drawn from a Gaussian Process (GP) prior, and observations $y_n$ are independent given $x_n$. Sufficient technical and algorithmic details are provided for the implementation of RMHMC for distributions arising from GP priors."
64
+ ]
65
+ },
66
+ {
67
+ "title": "Recurrent Relational Networks for Complex Relational Reasoning",
68
+ "abstract": [
69
+ "Humans possess an ability to abstractly reason about objects and their interactions, an ability not shared with state-of-the-art deep learning models.",
70
+ "Relational networks, introduced by Santoro et al. (2017), add the capacity for relational reasoning to deep neural networks, but are limited in the complexity of the reasoning tasks they can address.",
71
+ "We introduce recurrent relational networks which increase the suite of solvable tasks to those that require an order of magnitude more steps of relational reasoning.",
72
+ "We use recurrent relational networks to solve Sudoku puzzles and achieve state-of-the-art results by solving 96.6% of the hardest Sudoku puzzles, where relational networks fail to solve any.",
73
+ "We also apply our model to the BaBi textual QA dataset solving 19/20 tasks which is competitive with state-of-the-art sparse differentiable neural computers.",
74
+ "The recurrent relational network is a general purpose module that can augment any neural network model with the capacity to do many-step relational reasoning."
75
+ ]
76
+ },
77
+ {
78
+ "title": "A Factorial Mixture Prior for Compositional Deep Generative Models",
79
+ "abstract": [
80
+ "We assume that a high-dimensional datum, like an image, is a compositional expression of a set of properties, with a complicated non-linear relationship between the datum and its properties.",
81
+ "This paper proposes a factorial mixture prior for capturing latent properties, thereby adding structured compositionality to deep generative models.",
82
+ "The prior treats a latent vector as belonging to Cartesian product of subspaces, each of which is quantized separately with a Gaussian mixture model.",
83
+ "Some mixture components can be set to represent properties as observed random variables whenever labeled properties are present.",
84
+ "Through a combination of stochastic variational inference and gradient descent, a method for learning how to infer discrete properties in an unsupervised or semi-supervised way is outlined and empirically evaluated."
85
+ ]
86
+ },
87
+ {
88
+ "title": "Recurrent Relational Networks",
89
+ "abstract": [
90
+ "This paper is concerned with learning to solve tasks that require a chain of interdependent steps of relational inference, like answering complex questions about the relationships between objects, or solving puzzles where the smaller elements of a solution mutually constrain each other.",
91
+ "We introduce the recurrent relational network, a general purpose module that operates on a graph representation of objects.",
92
+ "As a generalization of Santoro et al. [2017]'s relational network, it can augment any neural network model with the capacity to do many-step relational reasoning.",
93
+ "We achieve state of the art results on the bAbI textual question-answering dataset with the recurrent relational network, consistently solving 20/20 tasks.",
94
+ "As bAbI is not particularly challenging from a relational reasoning point of view, we introduce Pretty-CLEVR, a new diagnostic dataset for relational reasoning.",
95
+ "In the Pretty-CLEVR set-up, we can vary the question to control for the number of relational reasoning steps that are required to obtain the answer.",
96
+ "Using Pretty-CLEVR, we probe the limitations of multi-layer perceptrons, relational and recurrent relational networks.",
97
+ "Finally, we show how recurrent relational networks can learn to solve Sudoku puzzles from supervised training data, a challenging task requiring upwards of 64 steps of relational reasoning.",
98
+ "We achieve state-of-the-art results amongst comparable methods by solving 96.6% of the hardest Sudoku puzzles."
99
+ ]
100
+ },
101
+ {
102
+ "title": "Knowing What to Ask: A Bayesian Active Learning Approach to the Surveying Problem",
103
+ "abstract": [
104
+ "\n \n We examine the surveying problem, where we attempt to predict how a target user is likely to respond to questions by iteratively querying that user, collaboratively based on the responses of a sample set of users.",
105
+ "We focus on an active learning approach, where the next question we select to ask the user depends on their responses to the previous questions.",
106
+ "We propose a method for solving the problem based on a Bayesian dimensionality reduction technique.",
107
+ "We empirically evaluate our method, contrasting it to benchmark approaches based on augmented linear regression, and show that it achieves much better predictive performance, and is much more robust when there is missing data.",
108
+ "\n \n"
109
+ ]
110
+ },
111
+ {
112
+ "title": "A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning",
113
+ "abstract": [
114
+ "This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world.",
115
+ "We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object's representation, coming from a recognition model, and a latent state describing its dynamics.",
116
+ "As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step.",
117
+ "The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks."
118
+ ]
119
+ },
120
+ {
+ "title": "Groove Radio: A Bayesian Hierarchical Model for Personalized Playlist Generation",
+ "abstract": [
+ "This paper describes an algorithm designed for Microsoft's Groove music service, which serves millions of users worldwide.",
+ "We consider the problem of automatically generating personalized music playlists based on queries containing a \"seed\" artist and the listener's user ID.",
+ "Playlist generation may be informed by a number of information sources including: user-specific listening patterns, domain knowledge encoded in a taxonomy, acoustic features of audio tracks, and overall popularity of tracks and artists.",
+ "The importance assigned to each of these information sources may vary depending on the specific combination of user and seed artist.",
+ "The paper presents a method based on a variational Bayes solution for learning the parameters of a model containing a four-level hierarchy of global preferences, genres, sub-genres and artists.",
+ "The proposed model further incorporates a personalization component for user-specific preferences.",
+ "Empirical evaluations on both proprietary and public datasets demonstrate the effectiveness of the algorithm and showcase the contribution of each of its components."
+ ]
+ },
+ {
+ "title": "An Adaptive Resample-Move Algorithm for Estimating Normalizing Constants",
+ "abstract": [
+ "The estimation of normalizing constants is a fundamental step in probabilistic model comparison.",
+ "Sequential Monte Carlo methods may be used for this task and have the advantage of being inherently parallelizable.",
+ "However, the standard choice of using a fixed number of particles at each iteration is suboptimal because some steps will contribute disproportionately to the variance of the estimate.",
+ "We introduce an adaptive version of the Resample-Move algorithm, in which the particle set is adaptively expanded whenever a better approximation of an intermediate distribution is needed.",
+ "The algorithm builds on the expression for the optimal number of particles and the corresponding minimum variance found under ideal conditions.",
+ "Benchmark results on challenging Gaussian Process Classification and Restricted Boltzmann Machine applications show that Adaptive Resample-Move (ARM) estimates the normalizing constant with a smaller variance, using less computational resources, than either Resample-Move with a fixed number of particles or Annealed Importance Sampling.",
+ "A further advantage over Annealed Importance Sampling is that ARM is easier to tune."
+ ]
+ },
144
+ {
+ "title": "Bayesian Low-Rank Determinantal Point Processes",
+ "abstract": [
+ "Determinantal point processes (DPPs) are an emerging model for encoding probabilities over subsets, such as shopping baskets, selected from a ground set, such as an item catalog.",
+ "They have recently proved to be appealing models for a number of machine learning tasks, including product recommendation.",
+ "DPPs are parametrized by a positive semi-definite kernel matrix.",
+ "Prior work has shown that using a low-rank factorization of this kernel provides scalability improvements that open the door to training on large-scale datasets and computing online recommendations, both of which are infeasible with standard DPP models that use a full-rank kernel.",
+ "A low-rank DPP model can be trained using an optimization-based method, such as stochastic gradient ascent, to find a point estimate of the kernel parameters, which can be performed efficiently on large-scale datasets.",
+ "However, this approach requires careful tuning of regularization parameters to prevent overfitting and provide good predictive performance, which can be computationally expensive.",
+ "In this paper we present a Bayesian method for learning a low-rank factorization of this kernel, which provides automatic control of regularization.",
+ "We show that our Bayesian low-rank DPP model can be trained efficiently using stochastic gradient Hamiltonian Monte Carlo (SGHMC).",
+ "Our Bayesian model generally provides better predictive performance on several real-world product recommendation datasets than optimization-based low-rank DPP models trained using stochastic gradient ascent, and better performance than several state-of-the-art recommendation methods in many cases."
+ ]
+ },
158
+ {
+ "title": "Indexable Probabilistic Matrix Factorization for Maximum Inner Product Search",
+ "abstract": [
+ "The Maximum Inner Product Search (MIPS) problem, prevalent in matrix factorization-based recommender systems, scales linearly with the number of objects to score.",
+ "Recent work has shown that clever post-processing steps can turn the MIPS problem into a nearest neighbour one, allowing sublinear retrieval time either through Locality Sensitive Hashing or various tree structures that partition the Euclidean space.",
+ "This work shows that instead of employing post-processing steps, substantially faster retrieval times can be achieved for the same accuracy when inference is not decoupled from the indexing process.",
+ "By framing matrix factorization to be natively indexable, so that any solution is immediately sublinearly searchable, we use the machinery of Machine Learning to best learn such a solution.",
+ "We introduce Indexable Probabilistic Matrix Factorization (IPMF) to shift the traditional post-processing complexity into the training phase of the model.",
+ "Its inference procedure is based on Geodesic Monte Carlo, and adds minimal additional computational cost to standard Monte Carlo methods for matrix factorization.",
+ "By coupling inference and indexing in this way, we achieve more than a 50% improvement in retrieval time against two state-of-the-art methods, for a given level of accuracy in the recommendations of two large-scale recommender systems."
+ ]
+ },
171
+ {
+ "title": "Collective Noise Contrastive Estimation for Policy Transfer Learning",
+ "abstract": [
+ "We address the problem of learning behaviour policies to optimise online metrics from heterogeneous usage data.",
+ "While online metrics, e.g., click-through rate, can be optimised effectively using exploration data, such data is costly to collect in practice, as it temporarily degrades the user experience.",
+ "Leveraging related data sources to improve online performance would be extremely valuable, but is not possible using current approaches.",
+ "We formulate this task as a policy transfer learning problem, and propose a first solution, called collective noise contrastive estimation (collective NCE).",
+ "NCE is an efficient solution to approximating the gradient of a log-softmax objective.",
+ "Our approach jointly optimises embeddings of heterogeneous data to transfer knowledge from the source domain to the target domain.",
+ "We demonstrate the effectiveness of our approach by learning an effective policy for an online radio station jointly from user-generated playlists, and usage data collected in an exploration bucket."
+ ]
+ },
184
+ {
+ "title": "Beyond Collaborative Filtering: The List Recommendation Problem",
+ "abstract": [
+ "Most Collaborative Filtering (CF) algorithms are optimized using a dataset of isolated user-item tuples.",
+ "However, in commercial applications recommended items are usually served as an ordered list of several items and not as isolated items.",
+ "In this setting, inter-item interactions have an effect on the list's Click-Through Rate (CTR) that is unaccounted for using traditional CF approaches.",
+ "Most CF approaches also ignore additional important factors like click propensity variation, item fatigue, etc.",
+ "In this work, we introduce the list recommendation problem.",
+ "We present useful insights gleaned from user behavior and consumption patterns from a large-scale real-world recommender system.",
+ "We then propose a novel two-layered framework that builds upon existing CF algorithms to optimize a list's click probability.",
+ "Our approach accounts for inter-item interactions as well as additional information such as item fatigue, trendiness patterns, contextual information, etc.",
+ "Finally, we evaluate our approach using a novel adaptation of Inverse Propensity Scoring (IPS) which facilitates off-policy estimation of our method's CTR and showcases its effectiveness in real-world settings."
+ ]
+ },
198
+ {
+ "title": "Sequential Neural Models with Stochastic Layers",
+ "abstract": [
+ "How can we efficiently propagate uncertainty in a latent state representation with recurrent neural networks?",
+ "This paper introduces stochastic recurrent neural networks which glue a deterministic recurrent neural network and a state space model together to form a stochastic and sequential neural generative model.",
+ "The clear separation of deterministic and stochastic layers allows a structured variational inference network to track the factorization of the model's posterior distribution.",
+ "By retaining both the nonlinear recursive structure of a recurrent neural network and averaging over the uncertainty in a latent path, like a state space model, we improve the state of the art results on the Blizzard and TIMIT speech modeling data sets by a large margin, while achieving comparable performance to competing methods on polyphonic music modeling."
+ ]
+ },
+ {
+ "title": "The Bayesian Low-Rank Determinantal Point Process Mixture Model",
+ "abstract": [
+ "Determinantal point processes (DPPs) are an elegant model for encoding probabilities over subsets, such as shopping baskets, of a ground set, such as an item catalog.",
+ "They are useful for a number of machine learning tasks, including product recommendation.",
+ "DPPs are parametrized by a positive semi-definite kernel matrix.",
+ "Recent work has shown that using a low-rank factorization of this kernel provides remarkable scalability improvements that open the door to training on large-scale datasets and computing online recommendations, both of which are infeasible with standard DPP models that use a full-rank kernel.",
+ "In this paper we present a low-rank DPP mixture model that allows us to represent the latent structure present in observed subsets as a mixture of a number of component low-rank DPPs, where each component DPP is responsible for representing a portion of the observed data.",
+ "The mixture model allows us to effectively address the capacity constraints of the low-rank DPP model.",
+ "We present an efficient and scalable Markov Chain Monte Carlo (MCMC) learning algorithm for our model that uses Gibbs sampling and stochastic gradient Hamiltonian Monte Carlo (SGHMC).",
+ "Using an evaluation on several real-world product recommendation datasets, we show that our low-rank DPP mixture model provides substantially better predictive performance than is possible with a single low-rank or full-rank DPP, and significantly better performance than several other competing recommendation methods in many cases."
+ ]
+ },
220
+ {
+ "title": "Low-Rank Factorization of Determinantal Point Processes",
+ "abstract": [
+ "Determinantal point processes (DPPs) have garnered attention as an elegant probabilistic model of set diversity.",
+ "They are useful for a number of subset selection tasks, including product recommendation.",
+ "DPPs are parametrized by a positive semi-definite kernel matrix.",
+ "In this work we present a new method for learning the DPP kernel from observed data using a low-rank factorization of this kernel.",
+ "We show that this low-rank factorization enables a learning algorithm that is nearly an order of magnitude faster than previous approaches, while also providing a method for computing product recommendation predictions that is far faster (up to 20x faster or more for large item catalogs) than previous techniques that involve a full-rank DPP kernel.",
+ "Furthermore, we show that our method provides equivalent or sometimes better test log-likelihood than prior full-rank DPP approaches."
+ ]
+ },
+ {
+ "title": "On the Convergence of Stochastic Variational Inference in Bayesian Networks",
+ "abstract": [
+ "We highlight a pitfall when applying stochastic variational inference to general Bayesian networks.",
+ "For global random variables approximated by an exponential family distribution, natural gradient steps, commonly starting from a unit length step size, are averaged to convergence.",
+ "This useful insight into the scaling of initial step sizes is lost when the approximation factorizes across a general Bayesian network, and care must be taken to ensure convergence in practice.",
+ "We experimentally investigate how much of the baby (well-scaled steps) is thrown out with the bath water (exact gradients)."
+ ]
+ },
241
+ {
+ "title": "Perturbation Theory for Variational Inference",
+ "abstract": [
+ "The variational approximation is known to underestimate the true variance of a probability measure, like a posterior distribution.",
+ "In latent variable models whose parameters are permutation-invariant with respect to the likelihood, the variational approximation might self-prune and ignore its parameters.",
+ "We view the variational free energy and its accompanying evidence lower bound as a first-order term from a perturbation of the true log partition function and derive a power series of corrections.",
+ "We sketch by means of a \u201cvariational matrix factorization\u201d example how a further term could correct for predictions from a self-pruned approximation."
+ ]
+ },
+ {
+ "title": "A large-scale exploration of group viewing patterns",
+ "abstract": [
+ "We present a large-scale study of television viewing habits, focusing on how individuals adapt their preferences when consuming content with others.",
+ "While there has been a great deal of research on modeling individual preferences, there has been considerably less work studying the preferences of groups, due mostly to the difficulty of collecting group data.",
+ "In contrast to most past work that has relied either on small-scale surveys, prototypes, or a relatively limited amount of group preference data, we explore more than 4 million logged household views paired with individual-level demographic and co-viewing information.",
+ "Our analysis reveals how engagement in group viewing varies by viewer and content type, and how viewing patterns shift across various group contexts.",
+ "Furthermore, we leverage this large-scale dataset to directly estimate how individual preferences are combined in group settings, finding subtle deviations from traditional models of preference aggregation.",
+ "We present a simple model which captures these effects and discuss the impact of these findings on the design of group recommendation systems."
+ ]
+ },
261
+ {
+ "title": "Scalable Bayesian Modelling of Paired Symbols",
+ "abstract": [
+ "We present a novel, scalable and Bayesian approach to modelling the occurrence of pairs of symbols (i,j) drawn from a large vocabulary.",
+ "Observed pairs are assumed to be generated by a simple popularity based selection process followed by censoring using a preference function.",
+ "By basing inference on the well-founded principle of variational bounding, and using new site-independent bounds, we show how a scalable inference procedure can be obtained for large data sets.",
+ "State of the art results are presented on real-world movie viewing data."
+ ]
+ },
+ {
+ "title": "Speeding up the Xbox recommender system using a euclidean transformation for inner-product spaces",
+ "abstract": [
+ "A prominent approach in collaborative filtering based recommender systems is using dimensionality reduction (matrix factorization) techniques to map users and items into low-dimensional vectors.",
+ "In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user's preference.",
+ "Traditionally, retrieving the most suitable items is done by scoring and sorting all items.",
+ "Real world online recommender systems must adhere to strict response-time constraints, so when the number of items is large, scoring all items is intractable.",
+ "We propose a novel order preserving transformation, mapping the maximum inner product search problem to a Euclidean space nearest neighbor search problem.",
+ "Utilizing this transformation, we study the efficiency of several (approximate) nearest neighbor data structures.",
+ "Our final solution is based on a novel use of the PCA-Tree data structure in which results are augmented using paths one Hamming distance away from the query (neighborhood boosting).",
+ "The end result is a system which allows approximate matches (items with relatively high inner product, but not necessarily the highest one).",
+ "We evaluate our techniques on two large-scale recommendation datasets, Xbox Movies and Yahoo Music, and show that this technique allows trading off a slight degradation in the recommendation quality for a significant improvement in the retrieval time."
+ ]
+ },
284
+ {
+ "title": "One-class Collaborative Filtering with Random Graphs: Annotated Version",
+ "abstract": [
+ "The bane of one-class collaborative filtering is interpreting and modelling the latent signal from the missing class.",
+ "In this paper we present a novel Bayesian generative model for implicit collaborative filtering.",
+ "It forms a core component of the Xbox Live architecture, and unlike previous approaches, delineates the odds of a user disliking an item from simply not considering it.",
+ "The latent signal is treated as an unobserved random graph connecting users with items they might have encountered.",
+ "We demonstrate how large-scale distributed learning can be achieved through a combination of stochastic gradient descent and mean field variational inference over random graph samples.",
+ "A fine-grained comparison is done against a state of the art baseline on real world data."
+ ]
+ },
+ {
+ "title": "Xbox movies recommendations: variational bayes matrix factorization with embedded feature selection",
+ "abstract": [
+ "We present a matrix factorization model inspired by challenges we encountered while working on the Xbox movies recommendation system.",
+ "The item catalog in a recommender system is typically equipped with meta-data features in the form of labels.",
+ "However, only part of these features are informative or useful with regard to collaborative filtering.",
+ "By incorporating a novel sparsity prior on feature parameters, the model automatically discerns and utilizes informative features while simultaneously pruning non-informative features.",
+ "The model is designed for binary feedback, which is common in many real-world systems where numeric rating data is scarce or non-existent.",
+ "However, the overall framework is applicable to any likelihood function.",
+ "Model parameters are estimated with a Variational Bayes inference algorithm, which is robust to over-fitting and does not require cross-validation and fine tuning of regularization coefficients.",
+ "The efficacy of our method is illustrated on a sample from the Xbox movies dataset as well as on the publicly available MovieLens dataset.",
+ "In both cases, the proposed solution provides superior predictive accuracy, especially for long-tail items.",
+ "We then demonstrate the feature selection capabilities and compare against the common case of simple Gaussian priors.",
+ "Finally, we show that even without features, our model performs better than a baseline model trained with the popular stochastic gradient descent approach."
+ ]
+ },
311
+ {
+ "title": "One-class collaborative filtering with random graphs",
+ "abstract": [
+ "The bane of one-class collaborative filtering is interpreting and modelling the latent signal from the missing class.",
+ "In this paper we present a novel Bayesian generative model for implicit collaborative filtering.",
+ "It forms a core component of the Xbox Live architecture, and unlike previous approaches, delineates the odds of a user disliking an item from simply being unaware of it.",
+ "The latent signal is treated as an unobserved random graph connecting users with items they might have encountered.",
+ "We demonstrate how large-scale distributed learning can be achieved through a combination of stochastic gradient descent and mean field variational inference over random graph samples.",
+ "A fine-grained comparison is done against a state of the art baseline on real world data."
+ ]
+ },
+ {
+ "title": "Mining Large-scale TV Group Viewing Patterns for Group Recommendation",
+ "abstract": [
+ "We present a large-scale study of television viewing habits, focusing on how individuals adapt their preferences when consuming content in group settings.",
+ "While there has been a great deal of recent work on modeling individual preferences, there has been considerably less work studying the behavior and preferences of groups, due mostly to the difficulty of data collection in these settings.",
+ "In contrast to past work that has relied either on small-scale surveys or prototypes, we explore more than 4 million logged views paired with individual-level demographic and co-viewing information to uncover variation in the viewing patterns of individuals and groups.",
+ "Our analysis reveals which genres are popular among specific demographic groups when viewed individually, how often individuals from different demographic categories participate in group viewing, and how viewing patterns change in various group contexts.",
+ "Furthermore, we leverage this large-scale dataset to directly estimate how individual preferences are combined in group settings, finding subtle deviations from traditional preference aggregation functions.",
+ "We present a simple model which captures these effects and discuss the impact of these findings on the design of group recommendation systems."
+ ]
+ },
333
+ {
+ "title": "Perturbative corrections for approximate inference in Gaussian latent variable models",
+ "abstract": [
+ "Expectation Propagation (EP) provides a framework for approximate inference.",
+ "When the model under consideration is over a latent Gaussian field, with the approximation being Gaussian, we show how these approximations can systematically be corrected.",
+ "A perturbative expansion is made of the exact but intractable correction, and can be applied to the model's partition function and other moments of interest.",
+ "The correction is expressed over the higher-order cumulants which are neglected by EP's local matching of moments.",
+ "Through the expansion, we see that EP is correct to first order.",
+ "By considering higher orders, corrections of increasing polynomial complexity can be applied to the approximation.",
+ "The second order provides a correction in quadratic time, which we apply to an array of Gaussian process and Ising models.",
+ "The corrections generalize to arbitrarily complex approximating families, which we illustrate on tree-structured Ising model approximations.",
+ "Furthermore, they provide a polynomial-time assessment of the approximation error.",
+ "We also provide both theoretical and practical insights on the exactness of the EP solution."
+ ]
+ },
+ {
+ "title": "A Bayesian Treatment of Social Links in Recommender Systems ; CU-CS-1092-12",
+ "abstract": [
+ "Recommender systems are increasingly driving user experiences on the Internet.",
+ "This personalization is often achieved through the factorization of a large but sparse observation matrix of user-item feedback signals.",
+ "In instances where the user's social network is known, its inclusion can significantly improve recommendations for cold start users.",
+ "There are numerous ways in which the network can be incorporated into a probabilistic graphical model.",
+ "We propose and investigate two ways for including a social network, either as a Markov Random Field that describes user similarity in the prior over user features, or an explicit model that treats social links as observations.",
+ "State of the art performance is reported on the Flixster online social network dataset."
+ ]
+ },
359
+ {
+ "title": "Collaborative learning of preference rankings",
+ "abstract": [
+ "We propose a model for learning user preference rankings for the purpose of making product recommendations.",
+ "The model allows us to learn from pairwise preference statements or from (incomplete) rankings over more than two items.",
+ "We present two algorithms for performing inference in this model, both with excellent scaling in the number of users and items.",
+ "The superior predictive performance of the new method is demonstrated on the well-known sushi preference data set.",
+ "In addition, we show how the model can be used effectively in an active learning setting where we select only a small number of informative items for learning."
+ ]
+ },
+ {
+ "title": "The Xbox recommender system",
+ "abstract": [
+ "A recent addition to Microsoft's Xbox Live Marketplace is a recommender system which allows users to explore both movies and games in a personalized context.",
+ "The system largely relies on implicit feedback, and runs on a large scale, serving tens of millions of daily users.",
+ "We describe the system design, and review the core recommendation algorithm."
+ ]
+ },
+ {
+ "title": "Transparent user models for personalization",
+ "abstract": [
+ "Personalization is a ubiquitous phenomenon in our daily online experience.",
+ "While such technology is critical for helping us combat the overload of information we face, in many cases, we may not even realize that our results are being tailored to our personal tastes and preferences.",
+ "Worse yet, when such a system makes a mistake, we have little recourse to correct it.",
+ "In this work, we propose a framework for addressing this problem by developing a new user-interpretable feature set upon which to base personalized recommendations.",
+ "These features, which we call badges, represent fundamental traits of users (e.g., \"vegetarian\" or \"Apple fanboy\") inferred by modeling the interplay between a user's behavior and self-reported identity.",
+ "Specifically, we consider the microblogging site Twitter, where users provide short descriptions of themselves in their profiles, as well as perform actions such as tweeting and retweeting.",
+ "Our approach is based on the insight that we can define badges using high precision, low recall rules (e.g., \"Twitter profile contains the phrase 'Apple fanboy'\"), and with enough data, generalize to other users by observing shared behavior.",
+ "We develop a fully Bayesian, generative model that describes this interaction, while allowing us to avoid the pitfalls associated with having positive-only data.",
+ "Experiments on real Twitter data demonstrate the effectiveness of our model at capturing rich and interpretable user traits that can be used to provide transparency for personalization."
+ ]
+ },
391
+ {
+ "title": "Large-scale Ordinal Collaborative Filtering",
+ "abstract": [
+ "This paper proposes a hierarchical probabilistic model for ordinal matrix factorization by actively modelling the ordinal nature of ranking data, which is typical of large-scale collaborative filtering tasks.",
+ "Two algorithms are presented for inference, one based on Gibbs sampling and one based on variational Bayes.",
+ "The model is evaluated on a collaborative filtering task, where users have rated a collection of movies and the system is asked to predict their ratings for other movies.",
+ "The Netflix data set is used for evaluation, which consists of around 100 million ratings.",
+ "Using root mean-squared error (RMSE) as an evaluation metric, results show that the suggested model improves on similar factorization techniques.",
+ "Results also show how Gibbs sampling outperforms variational Bayes on this task, despite the large number of ratings and model parameters."
+ ]
+ },
+ {
+ "title": "Vuvuzelas & Active Learning for Online Classification",
+ "abstract": [
+ "Many online service systems leverage user-generated content from Web 2.0 style platforms such as Wikipedia, Twitter, Facebook, and many more.",
+ "Often, the value lies in the freshness of this information (e.g. tweets, event-based articles, blog posts, etc.).",
+ "This freshness poses a challenge for supervised learning models as they frequently have to deal with previously unseen features.",
+ "In this paper we address the problem of online classification for tweets, namely, how can a classifier be updated in an online manner, so that it can correctly classify the latest \u201chype\u201d on Twitter?",
+ "We propose a two-step strategy to solve this problem.",
+ "The first step follows an active learning strategy that enables the selection of tweets for which a label would be most useful; the selected tweet is then forwarded to Amazon Mechanical Turk where it is labeled by multiple users.",
+ "The second step builds on a Bayesian corroboration model that aggregates the noisy labels provided by the users by taking their reliabilities into account."
+ ]
+ },
414
+ {
415
+ "title": "Perturbation Corrections in Approximate Inference: Mixture Modelling Applications",
416
+ "abstract": [
417
+ "Bayesian inference is intractable for many interesting models, making deterministic algorithms for approximate inference highly desirable.",
418
+ "Unlike stochastic methods, which are exact in the limit, the accuracy of these approaches cannot be reasonably judged.",
419
+ "In this paper we show how low order perturbation corrections to an expectation-consistent (EC) approximation can provide the necessary tools to ameliorate inference accuracy, and to give an indication of the quality of approximation without having to resort to Monte Carlo methods.",
420
+ "Further comparisons are given with variational Bayes and parallel tempering (PT) combined with thermodynamic integration on a Gaussian mixture model.",
421
+ "To obtain practical results we further generalize PT to temper from arbitrary distributions rather than a prior in Bayesian inference."
422
+ ]
423
+ },
424
+ {
425
+ "title": "Convexity and Bayesian constrained local models",
426
+ "abstract": [
427
+ "The accurate localization of facial features plays a fundamental role in any face recognition pipeline.",
428
+ "Constrained local models (CLM) provide an effective approach to localization by coupling ensembles of local patch detectors for non-rigid object alignment.",
429
+ "A recent improvement has been made by using generic convex quadratic fitting (CQF), which elegantly addresses the CLM warp update by enforcing convexity of the patch response surfaces.",
430
+ "In this paper, CQF is generalized to a Bayesian inference problem, in which it appears as a particular maximum likelihood solution.",
431
+ "The Bayesian viewpoint holds many advantages: for example, the task of feature localization can explicitly build on previous face detection stages, and multiple sets of patch responses can be seamlessly incorporated.",
432
+ "A second contribution of the paper is an analytic solution to finding convex approximations to patch response surfaces, which removes CQF's reliance on a numeric optimizer.",
433
+ "Improvements in feature localization performance are illustrated on the Labeled Faces in the Wild and BioID data sets."
434
+ ]
435
+ },
436
+ {
437
+ "title": "Improving on Expectation Propagation",
438
+ "abstract": [
439
+ "A series of corrections is developed for the fixed points of Expectation Propagation (EP), which is one of the most popular methods for approximate probabilistic inference.",
440
+ "These corrections can lead to improvements of the inference approximation or serve as a sanity check, indicating when EP yields unreliable results."
+ ]
+ },
+ {
+ "title": "Gaussian process modeling for image distortion correction in echo planar imaging",
+ "abstract": [
+ "An enhanced method for correction of image distortion due to B0\u2010field inhomogeneities in echo planar imaging (EPI) is presented.",
+ "The algorithm is based on the measurement of the point spread function (PSF) associated with each image voxel using a reference scan.",
+ "The expected distortion map in the phase encode direction is then estimated using a nonparametric inference algorithm known as Gaussian process modeling.",
+ "The algorithm is shown to be robust to the presence of regions of low signal\u2010to\u2010noise in the image and large inhomogeneities.",
+ "Magn Reson Med, 2008.",
+ "\u00a9 2008 Wiley\u2010Liss, Inc."
+ ]
+ },
+ {
+ "title": "Empirical Bayesian Change Point Detection",
+ "abstract": [
+ "This paper explores a Bayesian method for the detection of sudden changes in the generative parameters of a data series.",
+ "The problem is phrased as a hidden Markov model, where change point locations correspond to unobserved states, which grow in number with the number of observations.",
+ "Our interest lies in the marginal change point posterior density.",
+ "Rather than optimize a likelihood function of model parameters, we adapt the Baum-Welch algorithm to maximize a bound on the log marginal likelihood with respect to prior hyperparameters.",
+ "This empirical Bayesian approach allows scale-invariance, and can be viewed as an expectation maximization algorithm for hyperparameter optimization in conjugate exponential models with latent variables.",
+ "The expectation and maximization steps make respective use of variational and concave-convex inner loops.",
+ "A judicious choice of change point prior allows for fast recursive computations on a graphical model.",
+ "Results are shown on a number of real-world data sets."
+ ]
+ },
+ {
+ "title": "Bayesian inference for latent variable models",
+ "abstract": [
+ "Bayes\u2019 theorem is the cornerstone of statistical inference.",
+ "It provides the tools for dealing with knowledge in an uncertain world, allowing us to explain observed phenomena through the refinement of belief in model parameters.",
+ "At the heart of this elegant framework lie intractable integrals, whether in computing an average over some posterior distribution, or in determining the normalizing constant of a distribution.",
+ "This thesis examines both deterministic and stochastic methods in which these integrals can be treated.",
+ "Of particular interest shall be parametric models where the parameter space can be extended with additional latent variables to get distributions that are easier to handle algorithmically.",
+ "Deterministic methods approximate the posterior distribution with a simpler distribution over which the required integrals become tractable.",
+ "We derive and examine a new generic \u03b1-divergence message passing scheme for a multivariate mixture of Gaussians, a particular modeling problem requiring latent variables.",
+ "This algorithm minimizes local \u03b1-divergences over a chosen posterior factorization, and includes variational Bayes and expectation propagation as special cases.",
+ "Stochastic (or Monte Carlo) methods rely on a sample from the posterior to simplify the integration tasks, giving exact estimates in the limit of an infinite sample.",
+ "Parallel tempering and thermodynamic integration are introduced as \u2018gold standard\u2019 methods to sample from multimodal posterior distributions and determine normalizing constants.",
+ "A parallel tempered approach to sampling from a mixture of Gaussians posterior through Gibbs sampling is derived, and novel methods are introduced to improve the numerical stability of thermodynamic integration.",
+ "A full comparison with parallel tempering and thermodynamic integration shows variational Bayes, expectation propagation, and message passing with the Hellinger distance \u03b1 = 1/2 to be perfectly suitable for model selection, and for approximating the predictive distribution with high accuracy.",
+ "Variational and stochastic methods are combined in a novel way to design Markov chain Monte Carlo (MCMC) transition densities, giving a variational transition kernel, which lower bounds an exact transition kernel.",
+ "We highlight the general need to mix variational methods with other MCMC moves, by proving that the variational kernel does not necessarily give a geometrically ergodic chain."
+ ]
+ }
+ ],
+ "user_kps": [
+ "adaptive games",
+ "approximate inference",
+ "bayesian probabilistic matrix factorization",
+ "collaborative filtering",
+ "collaborative filters",
+ "deep generative models",
+ "epidemic models",
+ "face localization",
+ "human and machine intelligence",
+ "imaging algorithms",
+ "item-based collaborative filtering algorithm",
+ "latent variable models",
+ "personalization",
+ "policy learning",
+ "probabilistic matrix factorization",
+ "recommendation model",
+ "relation network",
+ "social viewing",
+ "stochastic variational inference",
+ "variational inference"
+ ]
+ }