Language-based digital twins using LLMs proposed for elderly cognitive health monitoring
Researchers propose a framework that uses large language models to construct digital twins of elderly individuals by mimicking their conversational patterns and stylometric cues, enabling continuous, non-invasive monitoring for Mild Cognitive Impairment. A multi-head conditional variational autoencoder (cVAE) is introduced to evaluate the fidelity of the twins and to predict Montreal Cognitive Assessment (MoCA) scores. Experiments on the I-CONECT dataset show that the approach preserves individual identity characteristics and outperforms baseline GPT-generated responses on reconstruction and cognitive score prediction. The work positions language-based digital twins as a scalable alternative to clinical cognitive assessment.
Related events (8)
Forgotten Words: Benchmarking NeoBERT for Dementia Detection in Low-Resource Conversational Filipino and English Speech
This paper presents the first NLP-based dementia detection study for Filipino speech, constructing a parallel bilingual dataset of 4,000 DementiaBank-derived transcripts with manual Filipino translations. Five model families are evaluated across monolingual, zero-shot cross-lingual, and bilingual fine-tuning settings. English-trained BERT degrades sharply on Filipino (Macro-F1 = 0.455), but bilingual fine-tuning recovers performance to Macro-F1 = 0.969–0.973 across all transformer models. The key finding is that multilingual clinical NLP performance is driven by linguistic coverage during training rather than model scale or architecture.
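For readers unfamiliar with the metric quoted above, Macro-F1 is the unweighted mean of per-class F1 scores, so a model that collapses toward one class is penalized even when overall accuracy looks acceptable. The following is a minimal sketch in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only (not results from the study)
y_true = ["dementia", "control", "dementia", "control"]
y_mixed = ["dementia", "dementia", "dementia", "control"]
y_collapsed = ["dementia", "dementia", "dementia", "dementia"]

print(round(macro_f1(y_true, y_mixed), 3))
print(round(macro_f1(y_true, y_collapsed), 3))
```

In the second call every transcript is assigned to one class, so the F1 of the other class is zero and the macro average falls below 0.5 on this balanced toy set. A zero-shot model that transfers poorly can fail in this way, though the summary does not say whether that is what happened here.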
LLMs predict dementia and depression severity from clinical interview transcripts in zero-shot and feature-extraction settings
Researchers evaluate three open-weight LLMs (Mistral 3.1, DeepHermes, Qwen3) for predicting dementia and depression severity from speech transcripts of 154 German-speaking patients in standardized clinical interviews. The study introduces a new observer-based Global Depression Scale (GDS-D) and tests both zero-shot prediction and LLM-based feature extraction for Support Vector Regression. Zero-shot prediction performs well for depression (MAE 0.60), while structured feature extraction reduces dementia assessment error by up to 35%. Pause-enriched automatic transcripts match human transcription quality, which suggests that fully automated screening pipelines are viable.
Multi-stage explainability framework translates transformer speech models into clinical cognitive impairment narratives
A new arXiv preprint proposes a framework for making transformer-based detection of cognitive impairment from speech clinically interpretable by combining SHAP token attribution, linguistic feature analysis, and a four-stage LLM reasoning pipeline using LLaMA-3.1-70B-Instruct. The system is built on the SpeechCARE-Adaptive Gating Network multimodal model (F1=72.11% on NIA PREPARE) and maps outputs to four cognitive-linguistic dimensions. Physician evaluation on 70 samples showed strong alignment with clinical profiles and a System Usability Scale score of 82/100, suggesting potential for integration into clinical workflows.
A Sleep-Like Consolidation Mechanism for LLMs
A preprint on arXiv proposes a sleep-like memory consolidation mechanism for large language models, drawing an analogy to biological sleep-based memory consolidation in neural systems. The work appears to address how LLMs might better retain and integrate new information over time, a key challenge in continual learning and knowledge updating. The paper attracted notable community attention on Hacker News with 164 points and 122 comments, suggesting broad interest in the approach.
Study identifies 'synthetic lived experience paradox' in peer-like AI caregiver support
Researchers examine how LLMs prompted to sound peer-like generate language implying lived experience they cannot authentically possess, studying this in the context of family caregivers of people with Alzheimer's disease and related dementias (ADRD). Using caregiver support exchanges from online communities and responses from LLaMA, GPT-4o-mini, and MedGemma, the study finds a 'narrative authenticity gap': AI captures the emotional work of peer support but can fabricate experiential grounding. Psycholinguistic analysis shows that human peers use significantly more first-person and past-focused language than AI. The authors argue that caregiver-support AI needs mechanisms to distinguish supportive framing from fabricated lived experience.
Clinically grounded privacy evaluation framework reveals high memorization risk in medical LMs
Researchers introduce a tiered adversarial framework for evaluating privacy leakage in medical language models, moving beyond simple training-text recovery to realistic clinical threat models. Applied to an LM pretrained on 378k clinical notes, the framework finds that routine encounter metadata (name, DOB, provider, visit date) elicits high verbatim memorization and sensitive-diagnosis recovery (AUROC 0.91 for abortion, 0.81 for HIV). The study also finds that exact-match memorization overstates disclosure risk because 36% of memorized tokens reflect templated documentation. The work provides a practical contextual privacy evaluation methodology for medical LMs trained on longitudinal patient data.
Systematic evaluation of multi-personality conditioning and dynamic switching in vision-language models
This paper introduces explicit personality conditioning for multimodal large language models (MLLMs) and proposes an evaluation framework covering single-personality induction, multi-personality composition, and dynamic personality switching. Experiments reveal that personality induction improves image captioning but degrades performance on precise reasoning tasks like VQA. The authors find balancing and residual effects during multi-trait composition and switching, and show that existing prompt-based personality induction methods transfer poorly to multimodal settings.
VisualMem: Personal Visual Memory Benchmark and Architecture for Personalized AI Agents
This paper introduces a benchmark and hybrid architecture (VisualMem) for personal visual memory in long-term AI agent memory systems. The work addresses a gap in existing text-centric memory systems by capturing both explicit evidence (recurring user-associated entities) and implicit evidence (latent user facts from visual/multimodal cues) from images. VisualMem augments a text-memory backend with a structured personal visual memory module that uses conversational context to resolve identity, ownership, and durable user facts. Experiments show VisualMem substantially outperforms prior memory systems on the new benchmark while remaining competitive on standard text-memory benchmarks.
