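Cite this article: Yin Yanjie, Cui Lei. Automatic extraction of sentences expressing specific semantic relationships according to the assembly rules of MeSH [J]. Chinese Journal of Medical Library and Information Science, 2019, 28(10): 34-41.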
Automatic extraction of sentences expressing specific semantic relationships according to the assembly rules of MeSH
DOI:10.3969/j.issn.1671-3982.2019.10.005
Keywords: Natural language processing  Relationship extraction  Knowledge expression
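Funding: One of the research outcomes of the National Natural Science Foundation of China project "Research on knowledge discovery using association rules of metadata in text databases" (70473103)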
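Author  Affiliation
Yin Yanjie  School of Medical Informatics, China Medical University, Shenyang, Liaoning 110122
Cui Lei  School of Medical Informatics, China Medical University, Shenyang, Liaoning 110122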
Abstract:
      Objective To automatically extract sentences expressing specific semantic relationships from abstracts according to the assembly rules of MeSH, in order to provide a basis for developing relationship extraction templates for natural language processing and for sentence-level information retrieval. Methods Candidate relationship sentences containing specific MeSH concepts were matched in abstract data using Python according to the assembly rules of subject headings, and the phrases or sentences describing the relationships between those concepts were extracted from them. Experts were invited to manually annotate the inter-concept semantic relationships in 100 candidate sentences. The resulting semantic relationship triples served as the gold standard for evaluation and were compared with the automatically extracted inter-concept relationships. The automatically extracted results were then organized into semantic relationship expressions between specific concepts. Results Syntactic analysis was performed on a large number of natural-language sentences. The method for batch identification of semantic relationships between two specific concepts achieved a precision of 87%, a recall of 62%, and an F1 of 71.8%. Conclusion Using the assembly rules of MeSH to extract sentences expressing specific semantic relationships achieves relatively high precision and recall, and offers a useful reference for biomedical text understanding and medical knowledge discovery.
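The abstract does not give implementation details, so the following is only a minimal Python sketch of the two steps it names: selecting candidate sentences that mention both concepts of a MeSH-derived pair, and scoring extracted triples against a gold standard. The subheading pairing in `RULE`, the function names, the naive sentence splitter, the matching on heading names alone (no entry terms or synonyms), and the sample record are all illustrative assumptions, not the authors' rule set or code.

```python
import re

# Hypothetical assembly rule: a heading indexed with the first subheading is
# paired with a heading indexed with the second one in the same record.
RULE = ("therapeutic use", "drug therapy")


def concept_pairs(mesh_headings, rule=RULE):
    """Pair headings whose subheadings match the two sides of the rule."""
    sub_a, sub_b = rule
    side_a = [h for h, q in mesh_headings if q == sub_a]
    side_b = [h for h, q in mesh_headings if q == sub_b]
    return [(a, b) for a in side_a for b in side_b]


def split_sentences(abstract):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def mentions(sentence, term):
    """Case-insensitive whole-phrase match of a heading name in a sentence."""
    return re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE) is not None


def candidate_sentences(abstract, concept_a, concept_b):
    """Keep only sentences that mention both concepts of the pair."""
    return [s for s in split_sentences(abstract)
            if mentions(s, concept_a) and mentions(s, concept_b)]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(predicted, gold):
    """Score predicted (concept, relation, concept) triples against gold triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Made-up record for illustration only.
mesh = [("Aspirin", "therapeutic use"),
        ("Myocardial Infarction", "drug therapy"),
        ("Humans", None)]
abstract = ("Aspirin lowered the rate of recurrent myocardial infarction in this cohort. "
            "Adverse events were rare.")

pairs = concept_pairs(mesh)
candidates = [s for a, b in pairs for s in candidate_sentences(abstract, a, b)]

predicted = [("aspirin", "lowers", "myocardial infarction"),
             ("aspirin", "causes", "headache")]
gold = [("aspirin", "lowers", "myocardial infarction"),
        ("statin", "lowers", "cholesterol")]
scores = precision_recall_f1(predicted, gold)
```

In this sketch, exact set intersection stands in for the comparison with the expert annotations; a real evaluation would need a policy for near-matching relation phrases. Note also that `f1_score(0.87, 0.62)` gives roughly 0.724 rather than the reported 71.8%. The abstract does not say whether F1 was computed from unrounded counts, which would be one possible explanation.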