sigdial@list.sigdial.org

SIGdial Mailing List

Call for papers 4th Workshop on Scholarly Document Processing - SDP@ACL 2024

Tirthankar Ghosal

Tue, Feb 20, 2024 4:59 PM

** Call for Research Papers **

Scholarly literature is the chief means by which scientists and academics
document and communicate their results and is therefore critical to the
advancement of knowledge and improvement of human well-being. At the same
time, this literature poses challenges to NLP uncommon in other genres,
such as specialized language and high background knowledge requirements,
long documents and strong structural conventions, multimodal presentation,
citation relationships among documents, an emphasis on rational
argumentation, and the frequent availability of detailed metadata. These
challenges necessitate the development of NLP methods and resources
optimized for this domain. The Scholarly Document Processing (SDP) workshop
provides a venue for discussing these challenges, bringing together
stakeholders from different communities including computational
linguistics, machine learning, text mining, information retrieval, digital
libraries, scientometrics and others, to develop methods, tasks, and
resources in support of these goals.

This workshop builds on the success of prior workshops: the 1st, 2nd, and
3rd SDP workshops held at EMNLP 2020, NAACL 2021, and COLING 2022, and the
1st and 2nd SciNLP workshops held at AKBC 2020 and 2021. In addition to
having broad appeal within the NLP community, we hope the SDP workshop will
attract researchers from other relevant fields including meta-science,
scientometrics, data mining, information retrieval, and digital libraries,
bringing together these disparate communities within ACL.

Website: https://sdproc.org/2024/

X (Twitter): https://twitter.com/sdpworkshop

Topics of Interest

We invite submissions from all communities demonstrating usage of and
challenges associated with natural language processing, information
retrieval, and data mining of scholarly and scientific documents. Relevant
topics include (but are not limited to):

- Large Language Models (LLMs) for Science
- Representation learning and language modeling
- Information extraction and NER
- Document understanding
- Summarization and generation
- Question-answering
- Discourse modeling/argumentation mining
- Network analysis
- Bibliometrics, scientometrics, and altmetrics
- Reproducibility and research integrity, including new challenges posed
  by generative AI
- Peer review tools, principles and technology
- Metadata and indexing
- Inclusion of datasets and computational resources
- Research infrastructures and digital libraries
- Increasing the representation in scholarly work of disadvantaged
  populations
- LLM-based interfaces to consume/produce scholarly documents

** Submission Information **

Authors are invited to submit full and short papers with unpublished,
original work. Submissions will be subject to a double-blind peer-review
process. Accepted papers will be presented by the authors at the workshop
either as a talk or a poster. All accepted papers will be published in the
workshop proceedings (proceedings from previous years can be found here:
https://aclanthology.org/venues/sdp/).

The submissions must be in PDF format and anonymized for review. All
submissions must be written in English and follow the ACL 2024 formatting
requirements:

Long paper submissions: up to 8 pages of content, plus unlimited references.

Short paper submissions: up to 4 pages of content, plus unlimited
references.

Submission Website: Papers must be submitted through OpenReview:
<https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/SDProc>

Final versions of accepted papers will be allowed 1 additional page of
content so that reviewer comments can be taken into account.

** Important Dates (Main Research Track) **

Paper submission deadline: May 17 (Friday), 2024

Notification of acceptance: June 17 (Monday), 2024

Camera-ready paper due: July 1 (Monday), 2024

Workshop date: August 16, 2024

** SDP 2024 Keynote Speakers **

We are excited to have several keynote speakers at SDP 2024.

1. Iryna Gurevych, Professor at Technical University Darmstadt and head of
   the UKP Lab, Germany
2. Anna Rogers, Assistant Professor, University of Copenhagen, Denmark
3. Heng Ji, Professor, University of Illinois at Urbana-Champaign, USA
4. Doug Downey, Associate Professor at Northwestern University and Research
   Manager at Allen Institute for AI, USA

** SDP 2024 Shared Tasks **

SDP 2024 will host two exciting shared tasks. More information about both
shared tasks is provided on the workshop website:
https://sdproc.org/2024/sharedtasks.html

DAGPap24: Detecting automatically generated scientific papers

A big problem with the ubiquity of Generative AI is that it has now become
very easy to generate fake scientific papers. This can erode public trust
in science and attack the foundations of science: are we standing on the
shoulders of robots? The Detecting Automatically Generated Papers (DAGPAP)
competition aims to encourage the development of robust, reliable
AI-generated scientific text detection systems, utilizing a diverse dataset
and varied machine learning models in a number of scientific domains.

Organizers: Savvas Chamezopoulos, Yury Kashnitsky, Drahomira Herrmannova,
Anita de Waard (Elsevier), Domenic Rosati (Scite)

Context24: Contextualizing Scientific Figures and Tables

When making sense of results across many research papers on a topic,
figures or tables of key results from the papers can serve as effective,
information-dense summaries that can be compared/contrasted and synthesized
with other results. However, to understand the results, key elements (e.g.,
measures, sample) need to be contextualized with associated methodological
details, which are typically dispersed throughout the text, often far from
the figure/table and from each other. In this shared task, we are
interested in contextualizing scientific figures and tables, i.e.,
automatically retrieving and ranking snippets from the paper that are most
needed to interpret their results, with the goal of making figures/tables
more self-contained.

Organizers: Joel Chan, Matthew Akamatsu

** Organizing Committee **

Tirthankar Ghosal, Oak Ridge National Laboratory, USA

Philipp Mayr, GESIS – Leibniz Institute for the Social Sciences, Germany

Aakanksha Naik, Allen Institute for AI, USA

Shannon Shen, Massachusetts Institute of Technology, USA

Amanpreet Singh, Allen Institute for AI, USA

Anita de Waard, Elsevier, Netherlands

Orion Weller, Johns Hopkins University, USA

Yanxia Qin, National University of Singapore, Singapore

Yoonjoo Lee, Korea Advanced Institute of Science & Technology, South Korea

+++++++++++++++++++++++++++++++++++

Tirthankar Ghosal

Scientist

National Center for Computational Sciences (NCCS)

Oak Ridge National Laboratory, United States

++++++++++++++++++++++++++++++++++++
