CN108549634A - A kind of Chinese patent text similarity calculating method

Publication number: CN108549634A
Application number: CN201810310198.1A
Priority: CN201810310198.1A
Authority: CN (China)
Language: Chinese (zh)
Inventors: 吕学强, 董志安
Applicant and assignee: Beijing Information Science and Technology University
Legal status: Pending (as listed on Google Patents; the status is an assumption and not a legal conclusion)
Prior art keywords: sentence, similarity, word, text, words

Classifications:
G06 - Computing; Calculating or Counting
G06F40/00 - Handling natural language data
G06F40/279 - Recognition of textual entities
G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06F18/214 - Generating training patterns; Bootstrap methods

Background: In the current Internet era, patents serve as a record of human achievement and contain a large body of scientific results and innovative technology. The rapid development of science and technology has sharply increased the number of patent applications filed each year. Traditional retrieval matches search terms and returns results, usually taking the number of times a term occurs as the measure of a patent's relevance, without considering the semantic information contained in the patent description itself. The essence of patent examination is finding related patents that are highly similar to the patent under examination, and the most important step in that process is calculating patent text similarity. The usual approach represents each text in a vector space model and then takes the vector similarity computed in that space directly as the text similarity. In recent years, ontology, as a new form of knowledge representation and description, has been widely applied in areas such as the semantic web and information retrieval, and more and more researchers have begun to use ontologies for semantic analysis.

Abstract: The present invention relates to a Chinese patent text similarity calculation method, comprising: segmenting the text into words; calculating TF-IDF values for the segmentation result and extracting the words with the higher TF-IDF values as keywords; locating the sentences that contain the keywords and treating them as key sentences, which yields a key sentence set for each text, with the maximum keyword weight in a key sentence taken as the weight of that key sentence; calculating the weight of each key sentence with respect to the text; taking the key sentences of the text to be compared and of the comparison text in turn; and calculating the similarity of the texts from the sentence similarity of the key sentences. The invention uses an existing patent-domain ontology to analyze the semantic relations in patent text and calculates patent text similarity using a vector space model together with the domain ontology. The accuracy and recall of the results are higher, the degree of similarity between patents is described more accurately, patent examination can be accelerated, and the needs of practical application are well met.
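The steps named in the abstract (segmentation, TF-IDF keywords, key sentences weighted by their strongest keyword, and a text score built from key sentence similarity) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `top_k` cutoff, the smoothed IDF formula, and the symmetric best-match aggregation are all hypothetical choices made here, and the ontology-based sentence similarity of the invention is replaced by plain cosine similarity over term counts. The regex tokenizer is only a placeholder; real Chinese text would need a proper word segmenter in its place.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Placeholder segmenter (assumption): lowercase word tokens.
    # Chinese text would need a real word segmenter here.
    return re.findall(r"\w+", sentence.lower())


def split_sentences(text):
    # Split on common Western and Chinese sentence terminators.
    return [s.strip() for s in re.split(r"[.!?;。！？；]+", text) if s.strip()]


def tfidf(doc_tokens, corpus_sets):
    """TF-IDF score for each term of one document against a corpus."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus_sets)
    scores = {}
    for term, count in counts.items():
        df = sum(1 for doc in corpus_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF (assumption)
        scores[term] = count / len(doc_tokens) * idf
    return scores


def key_sentences(text, corpus, top_k=5):
    """Return [(sentence_tokens, weight)] with weights normalized to sum to 1."""
    sentences = [tokenize(s) for s in split_sentences(text)]
    doc_tokens = [t for tokens in sentences for t in tokens]
    corpus_sets = [set(tokenize(doc)) for doc in corpus]
    scores = tfidf(doc_tokens, corpus_sets)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    keywords = dict(ranked[:top_k])
    picked = []
    for tokens in sentences:
        hits = [keywords[t] for t in tokens if t in keywords]
        if hits:
            # Key sentence weight = maximum weight of the keywords it contains.
            picked.append((tokens, max(hits)))
    total = sum(weight for _, weight in picked)
    return [(tokens, weight / total) for tokens, weight in picked]


def sentence_similarity(a, b):
    # Stand-in for the ontology-based measure: cosine over term counts.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def text_similarity(text_a, text_b, corpus, top_k=5):
    """Weighted best-match similarity between the key sentences of two texts."""
    ka = key_sentences(text_a, corpus, top_k)
    kb = key_sentences(text_b, corpus, top_k)
    if not ka or not kb:
        return 0.0

    def directed(src, dst):
        # Each source key sentence contributes its best match, scaled by weight.
        return sum(w * max(sentence_similarity(s, t) for t, _ in dst)
                   for s, w in src)

    # Average both directions so the score is symmetric (assumption).
    return 0.5 * (directed(ka, kb) + directed(kb, ka))
```

Calling `text_similarity(a, b, corpus)` with `corpus` set to the list of all texts under comparison returns a score between 0 and 1. Swapping `sentence_similarity` for an ontology-aware measure is the point at which this sketch would come closer to what the abstract describes.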