Jun Araki

I am a Lead Scientist at Bosch Research in Sunnyvale, CA, USA. I work in the areas of natural language processing, machine learning, and knowledge representation and reasoning. More specifically, I am interested in information extraction, question answering, question generation, and knowledge base construction and utilization. Broadly speaking, I am interested in theoretical foundations and practical algorithms for approaching the meaning of natural language computationally. Previously, I obtained a Ph.D. from the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University. My Ph.D. study focused on learnable models for capturing event semantics and structures.

News

2023-10-09: Our paper on a dataset for hallucination detection was accepted to the Findings of EMNLP 2023.
2023-06-20: I accepted an invitation to serve as an Area Chair for the Information Extraction track at EMNLP 2023.
2023-06-05: I accepted an invitation to serve as a Senior Program Committee Member (Meta-Reviewer) at AAAI 2024.
2023-05-03: I accepted an invitation to serve as an Area Chair for the Information Extraction track at IJCNLP-AACL 2023.
2023-05-02: Our paper on a dataset for situated proactive response selection was accepted to ACL 2023.
2023-05-02: Our paper on co-augmentation with self-training and rule augmentation was accepted to the Findings of ACL 2023.
2023-02-12: I served as a Session Chair for two sessions in the technical program of AAAI 2023.
2023-01-24: Our paper on language model prompting in low-resource domains was accepted to EACL 2023.
2022-10-09: Our paper on retrieval attention in open-domain question answering was accepted to EMNLP 2022.
2022-08-17: Our paper on multi-hop reasoning in generative question answering was accepted to COLING 2022.

Research Interests

My research interests are in computational semantics, statistical natural language processing, and knowledge representation and reasoning. My general research questions are as follows:

How can we make computers construct general-purpose and domain-specific semantic resources automatically from natural language sources with the aim of semantically effective natural language processing?
How can we make computers utilize these resources in semantically oriented natural language processing tasks such as information extraction, dialog management, and question answering?

Education

Ph.D. in Language and Information Technology, Carnegie Mellon University, August 2012 to August 2018 (certified in July 2018)
Thesis: Extraction of Event Structures from Text
M.S. in Computer Science, Stanford University, September 2009 to June 2011
Specialization: Artificial Intelligence
M.Eng. in Electronic Engineering, The University of Tokyo, April 2001 to March 2003
Thesis: Text Classification with a Polysemy Considered Feature Set
B.Eng. in Information and Communication Engineering, The University of Tokyo, April 1997 to March 2001

Selected Honors and Awards

Outstanding Reviewer at EMNLP, 2020 and 2021
IBM Ph.D. Fellowship, 2015 to 2017
Graduate Research Fellowship, Carnegie Mellon University, 2012 to 2018
Funai Overseas Scholarship, Funai Foundation for Information Technology, 2012 to 2014
