A comparative study on the application of ChatGPT o1 and Claude 3.5 Sonnet in the clinical adjuvant diagnosis and treatment of unicentric Castleman disease in the neck

Chinese Library Classification number: R739.91

Fund Project: National Natural Science Foundation of China, Regional Science Fund Project (82260475); Gansu Provincial Health Industry Science and Technology Innovation Major Project (GSWSZD2024-02); Gansu Provincial Science and Technology Program Key Research and Development Project (25YFWA028).
    Abstract:

    Objective To compare how ChatGPT o1 and Claude 3.5 Sonnet answer common questions about unicentric Castleman disease of the neck. Methods A total of 36 common questions about unicentric Castleman disease of the neck were designed and entered into ChatGPT o1 and Claude 3.5 Sonnet. The answers generated by each model were independently evaluated by a professor of otolaryngology-head and neck surgery for readability, accuracy, quality, understandability, and actionability. Results Regarding readability, Claude 3.5 Sonnet generated shorter responses across all categories (word count, Claude 3.5 Sonnet vs. ChatGPT o1: 189.36±69.09 vs. 381.56±153.28, P<0.05), with lower reading scores (1.68±5.64 vs. 11.20±11.16, P<0.05) and higher grade-level scores (54.93±35.81 vs. 16.70±2.03, P<0.05), both consistent with lower readability. On the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P), Claude 3.5 Sonnet showed higher overall understandability (0.38±0.17 vs. 0.06±0.05, P<0.05) and actionability (0.25±0.22 vs. 0.08±0.09, P=0.015). However, ChatGPT o1 had a higher overall accuracy score (4.88±0.28 vs. 4.58±0.37, P=0.0022) and a better quality score under the modified Ensuring Quality Information for Patients (EQIP) criteria (7.47±1.28 vs. 5.75±1.20, P<0.05). Conclusion Claude 3.5 Sonnet is superior in conciseness, understandability, and actionability, whereas ChatGPT o1 is stronger in accuracy, overall quality, and readability.
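
    The abstract reports each comparison as mean±SD with a P value but does not name the statistical test. As a rough illustration only, the sketch below shows how a two-group comparison could be checked from such summary statistics with Welch's unequal-variance t-test. Two things here are assumptions, not taken from the source: that each model contributed 36 responses (one per question), and that Welch's test is a reasonable stand-in. Because both models answered the same questions, the authors may well have used a paired or nonparametric test, so this is not expected to reproduce the reported P values. The function name `welch_t_from_summary` is hypothetical.

```python
import math


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for Welch's t-test computed from summary statistics.

    Hypothetical helper for illustration; not the authors' method.
    """
    # Squared standard error of each group mean
    se2_a = sd_a ** 2 / n_a
    se2_b = sd_b ** 2 / n_b
    # Welch t statistic: difference in means over the combined standard error
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Assumption: 36 responses per model, one for each of the 36 questions.
N = 36

# Word counts from the abstract: Claude 3.5 Sonnet vs. ChatGPT o1
t, df = welch_t_from_summary(189.36, 69.09, N, 381.56, 153.28, N)
print(f"word count: t = {t:.2f}, df = {df:.1f}")
```

    A |t| well above roughly 2 at these degrees of freedom corresponds to a two-sided P below 0.05, which is in line with the direction of the word-count result reported in the abstract.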

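Cite this article

潘鑫, 郜飞, 陈亭亭, 朱一鸣, 程小凌, 梁江盟, 岳天宇, 张政, 雷齐鸣, 卫旭东. A comparative study on the application of ChatGPT o1 and Claude 3.5 Sonnet in the clinical adjuvant diagnosis and treatment of unicentric Castleman disease in the neck [J]. Chinese Journal of Otorhinolaryngology-Skull Base Surgery, 2025, 31(6): 76-82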

History
  • Received: 2024-11-09
  • Published online: 2026-01-16
  • Publication date: 2025-12-30