Entity · model

BioLlama3

model · active · biollama3-0c24c1ed · 1 event · first seen Jun 8, 2026

Aliases: BioLlama3

Co-occurring entities

BioBERT · MedMCQA · ClinicalBERT · GPT-3.5 · Llama 3

More like this (12)

Llama-3 · TinyLlama · Llama 3 · Llama3-8B · TinyLlama-1.1B · Llama 3.2 · Llama-3.1-8B · Meta Llama 3.1 405B · Llama 1B · Llama · Meta Llama · Llama-3.2-1B-Instruct

Recent events (1)

5 · arXiv · cs.CL · Jun 8, 2026

Systematic evaluation of LLM prompt sensitivity in healthcare settings reveals safety risks

Researchers conduct a sensitivity analysis of both general-purpose and medical-specific LLMs using the MedMCQA benchmark, testing robustness to lexical and syntactic prompt perturbations. The study finds that even minor phrasing changes can alter clinical advice, and adversarial prompts can produce dangerous outputs such as incorrect dosages or omitted critical findings. Both general-purpose models (GPT-3.5, Llama 3) and domain-specific models (ClinicalBERT, BioLlama3, BioBERT) exhibit this fragility, with syntactic reordering and misleading contextual cues proving more destabilizing than simple paraphrasing.
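
The kind of sensitivity measurement summarized above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the `flip_rate` metric, the two perturbation helpers, the tiny synonym table, the sample question, and the deliberately brittle `brittle_stub` that stands in for a real model. It shows only the general idea of comparing a model's answer on an original multiple-choice prompt against its answers on perturbed variants.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases for this sketch (not from the study).
Model = Callable[[str, Sequence[str]], int]   # returns the index of the chosen option
Perturbation = Callable[[str], str]           # rewrites the question text


def lexical_paraphrase(question: str) -> str:
    """Lexical perturbation: swap phrases using a tiny, made-up synonym table."""
    synonyms = {"patient": "individual", "presents with": "shows"}
    for old, new in synonyms.items():
        question = question.replace(old, new)
    return question


def syntactic_reorder(question: str) -> str:
    """Syntactic perturbation: move the first sentence after the rest."""
    head, sep, tail = question.partition(". ")
    if not sep:
        return question  # single sentence, nothing to reorder
    return f"{tail} {head}."


def flip_rate(model: Model, question: str, options: Sequence[str],
              perturbations: List[Perturbation]) -> float:
    """Fraction of perturbed prompts whose answer differs from the baseline answer."""
    if not perturbations:
        raise ValueError("at least one perturbation is required")
    baseline = model(question, options)
    flips = sum(model(p(question), options) != baseline for p in perturbations)
    return flips / len(perturbations)


def brittle_stub(question: str, options: Sequence[str]) -> int:
    """Stand-in for a real model: its answer depends only on how the prompt starts."""
    return 0 if question.startswith("A ") else 1


QUESTION = "A patient presents with fever and a rash. Which drug is first-line?"
OPTIONS = ["Drug A", "Drug B"]  # placeholder options, not real clinical content

rate = flip_rate(brittle_stub, QUESTION, OPTIONS,
                 [lexical_paraphrase, syntactic_reorder])
print(f"flip rate: {rate:.2f}")
```

In a real evaluation the stub would be replaced by calls to the models under test, and the flip rate would be aggregated over many benchmark questions and grouped by perturbation type, so that paraphrasing and reordering can be compared separately.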

Evaluation and Benchmarking · AI Safety Research · BioLlama3 · BioBERT · MedMCQA