Comparing Pre-Trained Human Language Models: Is It Better with Human Context as Groups, Individual Traits, or Both?
-
2024/08/15
-
Details
-
Personal Author:
-
Description:Pre-trained language models consider the context of neighboring words and documents but lack any author context of the human generating the text. However, language depends on the author's states, traits, social, situational, and environmental attributes, collectively referred to as human context (Soni et al., 2024). Human-centered natural language processing requires incorporating human context into language models. Currently, two methods exist: pre-training with 1) group-wise attributes (e.g., over-45-year-olds) or 2) individual traits. Group attributes are simple but coarse - not all 45-year-olds write the same way - while individual traits allow for more personalized representations, but require more complex modeling and data. It is unclear which approach benefits which tasks. We compare pre-training models with human context via 1) group attributes, 2) individual users, and 3) a combined approach on five user- and document-level tasks. Our results show that there is no best approach, but that human-centered language modeling holds avenues for different methods. [Description provided by NIOSH]
-
Subjects:
-
Keywords:
-
ISBN:9798891761568
-
Publisher:
-
Document Type:
-
Funding:
-
Genre:
-
Place as Subject:
-
CIO:
-
Topic:
-
Location:
-
Pages in Document:316-328
-
NIOSHTIC Number:nn:20070182
-
Citation:WASSA 2024: Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis, August 15, 2024, Bangkok, Thailand. De Clercq O, Barriere V, Barnes J, Klinger R, Sedoc J, Tafreshi S, eds. Stroudsburg, PA: Association for Computational Linguistics (ACL), 2024 Aug; :316-328
-
Editor(s):
-
Federal Fiscal Year:2024
-
Performing Organization:State University of New York, Stony Brook
-
Peer Reviewed:False
-
Start Date:20220701
-
Source Full Name:WASSA 2024: Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis, August 15, 2024, Bangkok, Thailand
-
End Date:20260630
-
Collection(s):
-
Main Document Checksum:urn:sha-512:e8b3e2fff2d44371f1718ad1c0e3906f4b5bfb89824d12dbcf63ab61f7755f7a183f8e14aa0eefe8d616a6ed7d75e78384a643f44f335e9b64f1edd5a0e6585d
-
Download URL:
-
File Type:
CDC STACKS serves as an archival repository of CDC-published products including scientific findings, journal articles, guidelines, recommendations, or other public health information authored or co-authored by CDC or funded partners.
As a repository, CDC STACKS retains documents in their original published format to ensure public access to scientific information.