Respondent Simulation and Construct Validation: A Framework for Developing LLM Privacy Surveys Under Data Scarcity

Bibliographic Details
Published in 2025 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 70-76
Main Authors Meng, Xiuzhe; Li, Lexin
Format Conference Proceeding
Language English
Published IEEE 12.07.2025
ISSN 2837-6617
DOI 10.1109/ISI65680.2025.11201143

Abstract As Large Language Models (LLMs) are widely deployed, the privacy and security issues they raise have drawn growing attention. Existing studies often struggle to obtain real user data, which hinders the development and validation of user privacy concern scales. This study proposes a structured, intelligent, and theoretically robust method for rapidly generating an LLM privacy concern scale and validating its effectiveness. The study first combines the theoretical framework of structural equation modeling (SEM) with the BRASS instruction-driven mechanism to build a set of instruction tools that automatically generate questionnaire items. It then draws on social psychology and risk perception theories to construct the characteristics of a simulated respondent group with high ecological validity. Using a theory-driven scoring function and path model, the study generates simulated response data and embeds a human-computer collaborative mechanism, combining LLM self-checking with expert review, to ensure the quality and theoretical consistency of the generated data. Preliminary validation using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) shows that the simulated data and scales have good internal consistency and structural validity. The results confirm that the method can provide rapid, cost-effective conceptual validation when real data are unavailable, which improves the efficiency and methodological rigor of research on LLM privacy and security and provides an important tool to support subsequent large-scale empirical research.
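The abstract's pipeline (a path model that produces simulated responses, followed by an internal-consistency check) can be illustrated with a minimal sketch. This record does not give the paper's scoring function, item wording, or parameters, so everything below is an assumption made for illustration: the two latent variables (`risk` and `concern`), the single structural path of 0.5, the uniform loading of 0.8, the Likert cut points, and the function names `simulate_likert` and `cronbach_alpha`. It covers only Cronbach's alpha, not the EFA or CFA steps.

```python
import numpy as np


def simulate_likert(n_respondents=500, items_per_factor=4,
                    loading=0.8, path=0.5, seed=0):
    """Simulate 5-point Likert responses from a two-factor path model.

    All parameter values are hypothetical; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Structural part: latent "risk perception" drives latent "privacy concern".
    risk = rng.standard_normal(n_respondents)
    concern = path * risk + np.sqrt(1 - path**2) * rng.standard_normal(n_respondents)
    latent = np.column_stack([risk, concern])

    # Measurement part: each latent variable is measured by several noisy items.
    # Columns are ordered [risk items..., concern items...].
    factors = np.repeat(latent, items_per_factor, axis=1)
    noise = rng.standard_normal(factors.shape)
    continuous = loading * factors + np.sqrt(1 - loading**2) * noise

    # Discretize the continuous scores into Likert categories 1..5.
    cuts = np.array([-1.5, -0.5, 0.5, 1.5])
    return np.digitize(continuous, cuts) + 1


def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_var / total_var)


if __name__ == "__main__":
    data = simulate_likert()
    print("alpha (risk items):   ", round(cronbach_alpha(data[:, :4]), 3))
    print("alpha (concern items):", round(cronbach_alpha(data[:, 4:]), 3))
```

Because the items are generated from the very factor structure that is later tested, high consistency here is expected by construction; a check like this shows that the simulated scale is coherent with its theory, not that real respondents would answer the same way.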
Author Li, Lexin
Meng, Xiuzhe
Author_xml – sequence: 1
  givenname: Xiuzhe
  surname: Meng
  fullname: Meng, Xiuzhe
  email: xiuzhe.meng@ia.ac.cn
  organization: Tianjin University,Department of Management and Economics,Tianjin,China
– sequence: 2
  givenname: Lexin
  surname: Li
  fullname: Li, Lexin
  email: l12lxn@nefu.edu.cn
  organization: School of Computer and Control Engineering, Northeast Forestry University,Harbin,China
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ISI65680.2025.11201143
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Xplorer
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Psychology
EISBN 9798331512767
EISSN 2837-6617
EndPage 76
ExternalDocumentID 11201143
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 72293575
  funderid: 10.13039/501100001809
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-ieee_primary_112011433
IEDL.DBID RIE
IngestDate Wed Oct 29 06:13:01 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-ieee_primary_112011433
ParticipantIDs ieee_primary_11201143
PublicationCentury 2000
PublicationDate 2025-July-12
PublicationDateYYYYMMDD 2025-07-12
PublicationDate_xml – month: 07
  year: 2025
  text: 2025-July-12
  day: 12
PublicationDecade 2020
PublicationTitle 2025 IEEE International Conference on Intelligence and Security Informatics (ISI)
PublicationTitleAbbrev ISI
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
Score 3.8359911
SourceID ieee
SourceType Publisher
StartPage 70
SubjectTerms Data collection
Data models
Data privacy
Large-scale language modeling
Mathematical models
Privacy
privacy security
Psychology
questionnaire generation
Reliability theory
Security
simulated data
Solids
Surveys
Title Respondent Simulation and Construct Validation: A Framework for Developing LLM Privacy Surveys Under Data Scarcity
URI https://ieeexplore.ieee.org/document/11201143
hasFullText 1
inHoldings 1
isFullTextHit
isPrint