Information extraction methods turn unstructured or semi-structured text into structured knowledge usable in downstream applications. Our lab's work has focused on joint extraction of entities and relationships from complex texts such as scientific publications, as well as from documents with rich layout features such as semi-structured webpages.
What's In My Big Data?
Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge
ICLR 2024
Project page
PDF
Source code
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study
Rahul Nadkarni, David Wadden, Iz Beltagy, Noah A. Smith, Hannaneh Hajishirzi, Tom Hope
AKBC 2021
PDF
Semantic Scholar
Source code
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto
EMNLP 2020
PDF
Semantic Scholar
Source code
Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia
Ikuya Yamada, Akari Asai, Jin Sakuma, Hiroyuki Shindo, Hideaki Takeda, Yoshiyasu Takefuji, Yuji Matsumoto
EMNLP 2020 (system demonstrations)
Project page
PDF
Semantic Scholar
Demo
Source code
Fact or Fiction: Verifying Scientific Claims
David Wadden, Shanchuan Lin, Kyle Lo, Lucy Lu Wang, Madeleine van Zuylen, Arman Cohan, Hannaneh Hajishirzi
EMNLP 2020
PDF
Semantic Scholar
Demo
Source code
SciREX: A Challenge Dataset for Document-Level Information Extraction
Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy
ACL 2020
Dataset
PDF
Source code
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
Colin Lockard, Prashant Shiralkar, Xin Luna Dong, Hannaneh Hajishirzi
ACL 2020
PDF
Entity, Relation, and Event Extraction with Contextualized Span Representations
David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi
EMNLP 2019
PDF
Semantic Scholar
Source code
A General Framework for Information Extraction using Dynamic Span Graphs
Yi Luan, David Wadden, Amy Shah, Mari Ostendorf, Hannaneh Hajishirzi
NAACL 2019
PDF
Source code
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi
EMNLP 2018
Project page
PDF
Source code
Ceres: Distantly Supervised Relation Extraction from the Semi-Structured Web
Colin Lockard, Xin Luna Dong, Arash Einolghozati, Prashant Shiralkar
VLDB 2018
PDF
Semi-Supervised Event Extraction with Paraphrase Clusters
James Ferguson, Colin Lockard, Daniel S. Weld, Hannaneh Hajishirzi
NAACL 2018
PDF
UW system at SemEval-2018 Task 7: Neural Relation Extraction Model with Selectively Incorporated Concept Embeddings
Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi
SemEval 2018
PDF
Scientific Information Extraction with Semi-supervised Neural Tagging
Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi
EMNLP 2017
Project page
PDF