Beyond Discrete Personas: Personality Modeling Through Journal Intensive Conversations

State University of New York at Buffalo, Department of Computer Science and Engineering

LLMs fine-tuned on our JIC dataset align best with the gold annotations capturing personality traits, compared to other models (image shows fine-tuning on Persona Chat).

Abstract

Large Language Models (LLMs) have significantly improved personalized conversational capabilities. However, existing datasets like Persona Chat, Synthetic Persona Chat, and Blended Skill Talk rely on static, predefined personas. This approach often results in dialogues that fail to capture human personalities' fluid and evolving nature. To overcome these limitations, we introduce a novel dataset with around 400,000 dialogues and a framework for generating personalized conversations using long-form journal entries from Reddit. Our approach clusters journal entries for each author and filters them by selecting the most representative cluster, ensuring that the retained entries best reflect the author's personality. We further refine the data by capturing the Big Five personality traits (openness, conscientiousness, extraversion, agreeableness, and neuroticism), ensuring that dialogues authentically reflect an individual's personality. Using Llama 3 70B, we generate high-quality, personality-rich dialogues grounded in these journal entries. Fine-tuning models on this dataset leads to an 11% improvement in capturing personality traits on average, outperforming existing approaches in generating more coherent and personality-driven dialogues.

Data Generation Process

The synthetic data generation process is outlined in five distinct stages (left side). On the right side, we demonstrate how dialogues are generated from journal entries, highlighting the personality traits they reflect and align with. In Stage 3, where personality trait filtering is introduced, the initial values of the α and β parameters were set to None to allow extensive data generation before further refinement.
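The cluster-filtering idea (keep only the author's most representative cluster of journal entries) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, cluster labels are assumed to come from any upstream clustering algorithm, and "most representative" is approximated here as the cluster whose centroid lies closest to the author's mean embedding.

```python
# Sketch of the cluster-filtering stage. Assumptions: embeddings and
# cluster labels are precomputed; "most representative" is interpreted
# as the cluster centroid nearest the author's overall mean embedding.

def mean_vec(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(v[d] for v in vecs) / n for d in range(len(vecs[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def most_representative_entries(entries, embeddings, labels):
    """Keep the entries of the cluster closest to the author's centroid."""
    author_mean = mean_vec(embeddings)
    clusters = {}
    for entry, vec, lab in zip(entries, embeddings, labels):
        clusters.setdefault(lab, []).append((entry, vec))
    best = min(clusters, key=lambda lab: sq_dist(
        mean_vec([v for _, v in clusters[lab]]), author_mean))
    return [e for e, _ in clusters[best]]
```

For an author whose entries mostly sit in one tight region of embedding space, this keeps that dominant cluster and drops outlier entries (e.g. one-off posts unrelated to the author's usual voice).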

Dataset overview

Model Training and Inference Settings

Pipeline overview

Training was conducted in two settings: standard fine-tuning and Retrieval-Augmented Fine-tuning (RAFt). Standard fine-tuning used Low-Rank Adaptation (LoRA) to update specific projection layers, minimizing the negative log-likelihood (NLL) loss. RAFt enriched the input by prepending relevant journal segments selected with Maximum Marginal Relevance (MMR).
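The MMR step used to pick journal segments for RAFt can be sketched as below. This is an illustrative reimplementation of the standard MMR criterion, not the paper's code: the cosine similarity, the trade-off weight `lam`, and the toy vector inputs are assumptions.

```python
# Sketch of Maximum Marginal Relevance (MMR) selection over journal
# segments. Each pick maximizes lam * relevance-to-query minus
# (1 - lam) * redundancy with already-selected segments.

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query_vec, segment_vecs, k=2, lam=0.7):
    """Return indices of k segments balancing relevance and diversity."""
    selected, candidates = [], list(range(len(segment_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, segment_vecs[i])
            redundancy = max((cosine(segment_vecs[i], segment_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lam`, the second pick favors a segment dissimilar to the first even if it is less relevant; with a high `lam`, it favors pure relevance. The actual weight used in the paper is not specified here.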

Inference also had two settings: utterance-level and Retrieval-Augmented Generation (RAG). RAG distinguished between user statements and questions using a classifier. For questions, it retrieved relevant journal chunks to enhance input context, while for non-questions, no retrieval was performed. This selective approach improved handling of both simple and complex queries, optimizing performance for dialogue generation.
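The selective retrieval logic at inference can be sketched as below. The paper uses a trained classifier to separate questions from statements; here a simple surface heuristic stands in for it (an assumption), and `retrieve_chunks` is a hypothetical callable representing the journal-chunk retriever.

```python
# Sketch of selective RAG at inference: retrieve journal chunks only
# when the user turn is a question; otherwise pass the turn through.

def is_question(utterance):
    """Heuristic stand-in for the paper's statement/question classifier."""
    text = utterance.strip().lower()
    return text.endswith("?") or text.startswith(
        ("who", "what", "when", "where", "why", "how"))

def build_model_input(utterance, retrieve_chunks):
    """retrieve_chunks: callable mapping a query to relevant journal chunks."""
    if is_question(utterance):
        context = "\n".join(retrieve_chunks(utterance))
        return f"Context:\n{context}\n\nUser: {utterance}"
    return f"User: {utterance}"
```

This keeps simple statements cheap (no retrieval call) while grounding answers to questions in the speaker's journal entries.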

Comparison of dialogues


Comparison of real and model-generated dialogues capturing personality traits. The table shows how our best-performing models (LLaMA and Mistral) align with the traits reflected in the original dialogue.

Results


Performance of the LLaMA (left) and Mistral (right) models across various JIC dataset splits. Reported metrics: BLEU, METEOR, ROUGE-L, and their average.


Performance of LLaMA and Mistral models across various JIC dataset splits. The left panel displays the results for LLaMA, while the right panel shows the results for Mistral.


Personality trait scores across various datasets for the LLaMA 3 8B Instruct model (left) and Mistral 7B Instruct v0.3 (right).

BibTeX

@inproceedings{pal-etal-2025-beyond,
        title = "Beyond Discrete Personas: Personality Modeling Through Journal Intensive Conversations",
        author = "Pal, Sayantan  and
          Das, Souvik  and
          Srihari, Rohini K.",
        editor = "Rambow, Owen  and
          Wanner, Leo  and
          Apidianaki, Marianna  and
          Al-Khalifa, Hend  and
          Eugenio, Barbara Di  and
          Schockaert, Steven",
        booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
        month = jan,
        year = "2025",
        address = "Abu Dhabi, UAE",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2025.coling-main.470/",
        pages = "7055--7074",
        abstract = "Large Language Models (LLMs) have significantly improved personalized conversational capabilities. However, existing datasets like Persona Chat, Synthetic Persona Chat, and Blended Skill Talk rely on static, predefined personas. This approach often results in dialogues that fail to capture human personalities' fluid and evolving nature. To overcome these limitations, we introduce a novel dataset with around 400,000 dialogues and a framework for generating personalized conversations using long-form journal entries from Reddit. Our approach clusters journal entries for each author and filters them by selecting the most representative cluster, ensuring that the retained entries best reflect the author's personality. We further refine the data by capturing the Big Five personality traits{---}openness, conscientiousness, extraversion, agreeableness, and neuroticism{---}ensuring that dialogues authentically reflect an individual's personality. Using Llama 3 70B, we generate high-quality, personality-rich dialogues grounded in these journal entries. Fine-tuning models on this dataset leads to an 11{\%} improvement in capturing personality traits on average, outperforming existing approaches in generating more coherent and personality-driven dialogues."
    }