
ENTITY LINKING WITH A KNOWLEDGE BASE: ISSUES, TECHNIQUES, AND SOLUTIONS


Abstract

The large number of potential applications from bridging Web data with knowledge bases has led to an increase in entity linking research. Entity linking is the task of linking entity mentions in text with their corresponding entities in a knowledge base. Potential applications include information extraction, information retrieval, and knowledge base population. However, this task is challenging due to name variations and entity ambiguity. In this survey, we present a thorough overview and analysis of the main approaches to entity linking, and discuss various applications, the evaluation of entity linking systems, and future directions.
Entity linking can facilitate many different tasks such as knowledge base population, question answering, and information integration. As the world evolves, new facts are generated and digitally expressed on the Web. Therefore, enriching existing knowledge bases using new facts becomes increasingly important. However, inserting newly extracted knowledge derived from an information extraction system into an existing knowledge base inevitably requires a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. For example, relation extraction is the process of discovering useful relationships between entities mentioned in text, and an extracted relation requires mapping the entities associated with the relation to the knowledge base before it can be populated into the knowledge base.


EXISTING SYSTEM
Inserting newly extracted knowledge from an information extraction system into an existing knowledge base inevitably requires a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. At the same time, an entity mention could possibly denote different named entities. For instance, the entity mention “Sun” can refer to the star at the center of the Solar System, a multinational computer company, a fictional character named “Sun-Hwa Kwon” on the ABC television series “Lost”, or many other entities which can be referred to as “Sun”. An entity linking system has to disambiguate the entity mention in the textual context and identify the mapping entity for each entity mention.
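
The ambiguity of a mention such as “Sun” can be made concrete with a small name dictionary that maps a surface form to every entity it may denote. The Python sketch below is illustrative only: the entity identifiers, the toy dictionary contents, and the function names build_name_dictionary and candidate_entities are assumptions made for this example, not part of any system described here.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: entity ID -> known surface forms.
ENTITY_NAMES = {
    "Sun_(star)": ["Sun", "the Sun"],
    "Sun_Microsystems": ["Sun", "Sun Microsystems"],
    "Sun-Hwa_Kwon": ["Sun", "Sun-Hwa Kwon"],
}


def build_name_dictionary(entity_names):
    """Map each lowercased surface form to the set of entity IDs it may denote."""
    dictionary = defaultdict(set)
    for entity_id, names in entity_names.items():
        for name in names:
            dictionary[name.lower()].add(entity_id)
    return dictionary


NAME_DICTIONARY = build_name_dictionary(ENTITY_NAMES)


def candidate_entities(mention):
    """Return the sorted candidate entity IDs for a mention (empty if unknown)."""
    return sorted(NAME_DICTIONARY.get(mention.lower(), set()))
```

Calling candidate_entities("Sun") returns all three toy entities, which is exactly the situation a disambiguation step has to resolve, while an unambiguous form such as "Sun Microsystems" yields a single candidate.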



PROPOSED SYSTEM:
Prior work proposed a probabilistic model which unifies the entity popularity model with the entity object model to link the named entities in Web text with the DBLP bibliographic network. We strongly believe that this direction deserves much deeper exploration by researchers. Finally, it is expected that more research, or even a better understanding of the entity linking problem, may lead to the emergence of more effective and efficient entity linking systems, as well as improvements in the areas of information extraction and the Semantic Web.

MODULE DESCRIPTION:

Number of Modules:

After careful analysis the system has been identified to have the following modules:
1. Entity linking
2. Knowledge base
3. Candidate Entity Ranking

1. Entity linking
Entity linking can facilitate many different tasks such as knowledge base population, question answering, and information integration. As the world evolves, new facts are generated and digitally expressed on the Web. Therefore, enriching existing knowledge bases using new facts becomes increasingly important. However, inserting newly extracted knowledge derived from an information extraction system into an existing knowledge base inevitably requires a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. For example, relation extraction is the process of discovering useful relationships between entities mentioned in text, and an extracted relation requires mapping the entities associated with the relation to the knowledge base before it can be populated into the knowledge base. Furthermore, a large number of question answering systems rely on their supported knowledge bases to give the answer to the user’s question. To answer the question “What is the birthdate of the famous basketball player Michael Jordan?”, the system should first leverage the entity linking technique to map the queried “Michael Jordan” to the NBA player, instead of, for example, the Berkeley professor; it then retrieves the birthdate of the NBA player named “Michael Jordan” from the knowledge base directly. Additionally, entity linking supports powerful join and union operations that can integrate information about entities across different pages, documents, and sites. The entity linking task is challenging due to name variations and entity ambiguity.
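
The two-step question answering flow described above (first link the mention, then read the fact from the knowledge base) can be sketched as follows. This is a minimal illustration under stated assumptions: the KNOWLEDGE_BASE contents, the entity identifiers, and the helper names link_with_hint and answer_birthdate are hypothetical, and the word-overlap heuristic stands in for a real entity linking technique.

```python
# Hypothetical toy knowledge base: entity ID -> attribute dictionary.
KNOWLEDGE_BASE = {
    "Michael_Jordan_(basketball)": {
        "occupation": "basketball player",
        "birthdate": "1963-02-17",
    },
    "Michael_I._Jordan": {
        "occupation": "professor",
    },
}


def link_with_hint(candidates, hint_words):
    """Pick the candidate whose occupation shares the most words with the hint.

    Returns None when there are no candidates or no candidate overlaps at all.
    """
    hints = {word.lower() for word in hint_words}

    def overlap(entity_id):
        occupation = KNOWLEDGE_BASE[entity_id].get("occupation", "")
        return len(set(occupation.split()) & hints)

    if not candidates:
        return None
    best = max(sorted(candidates), key=overlap)
    return best if overlap(best) > 0 else None


def answer_birthdate(candidates, hint_words):
    """Step 1: link the mention. Step 2: look the birthdate up in the knowledge base."""
    entity = link_with_hint(candidates, hint_words)
    if entity is None:
        return None
    return KNOWLEDGE_BASE[entity].get("birthdate")
```

With the question words "famous basketball player" as the hint, the mention is linked to the NBA player before the birthdate is retrieved, so the professor's record is never consulted.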
 
2. Knowledge base:
Given a knowledge base containing a set of entities E and a text collection in which a set of named entity mentions M are identified in advance, the goal of entity linking is to map each textual entity mention m ∈ M to its corresponding entity e ∈ E in the knowledge base. Here, a named entity mention m is a token sequence in text which potentially refers to some named entity and is identified in advance. It is possible that some entity mention in text does not have a corresponding entity record in the given knowledge base. We define such mentions as unlinkable mentions and use NIL as a special label denoting “unlinkable”. Therefore, if the matching entity e for entity mention m does not exist in the knowledge base, an entity linking system should label m as NIL. For unlinkable mentions, some studies identify their fine-grained types from the knowledge base, which is out of scope for entity linking systems. Entity linking is also called Named Entity Disambiguation (NED) in the NLP community. In this paper, we focus on entity linking for the English language, rather than cross-lingual entity linking.

Typically, the task of entity linking is preceded by a named entity recognition stage, during which boundaries of named entities in text are identified. While named entity recognition is not the focus of this survey, for the technical details of approaches used in the named entity recognition task, readers can refer to a survey paper on that task and to some specific methods. In addition, there are many publicly available named entity recognition tools, such as Stanford NER, OpenNLP, and LingPipe. Finkel et al. introduced the approach used in Stanford NER. They leveraged Gibbs sampling to augment an existing Conditional Random Field based system with long-distance dependency models, enforcing label consistency and extraction template consistency.
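
The task definition (every pre-identified mention m in M is mapped either to an entity e in E or to the label NIL) can be written down as a small contract. The sketch below is only an illustration: the Mention class, the link_all function, and the resolve callback are hypothetical names introduced here, and the actual disambiguation logic is deliberately left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

NIL = "NIL"  # special label denoting an unlinkable mention


@dataclass(frozen=True)
class Mention:
    """A token sequence identified in advance, e.g. by a named entity recognizer."""
    text: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the end of the mention


def link_all(
    mentions: List[Mention],
    entities: Set[str],
    resolve: Callable[[Mention], Optional[str]],
) -> Dict[Mention, str]:
    """Map every mention to an entity in `entities`, or to NIL.

    `resolve` is any disambiguation strategy; a result that is None or that
    is not a member of the knowledge base is labeled NIL.
    """
    links = {}
    for mention in mentions:
        entity = resolve(mention)
        links[mention] = entity if entity in entities else NIL
    return links
```

Keeping NIL as an explicit label, rather than dropping unlinkable mentions, means the output always has one entry per input mention.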




3. Candidate Entity Ranking

In most cases, the size of the candidate entity set E_m is larger than one. Researchers leverage different kinds of evidence to rank the candidate entities in E_m and try to find the entity e ∈ E_m which is the most likely link for mention m. The main techniques used in this ranking process include supervised ranking methods and unsupervised ranking methods.

To deal with the problem of predicting unlinkable mentions, some work adds an unlinkable mention prediction step that validates whether the top-ranked entity identified in the Candidate Entity Ranking module is the target entity for mention m. If it is not, they return NIL for mention m.
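
One simple way to picture ranking followed by NIL validation is an unsupervised scorer that compares the words around the mention with a short description of each candidate, and rejects the top candidate when its score is too low. The sketch below is an assumption-laden illustration, not a method from the survey: the bag-of-words cosine similarity, the nil_threshold value of 0.1, the candidate descriptions, and the function names are all chosen for this example.

```python
import math
from collections import Counter

NIL = "NIL"


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(context, candidates):
    """Score each candidate (entity ID -> description) against the mention context.

    Returns (entity ID, score) pairs, best first; ties are broken by entity ID.
    """
    context_vector = bag_of_words(context)
    scored = [
        (entity_id, cosine(context_vector, bag_of_words(description)))
        for entity_id, description in candidates.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def link(context, candidates, nil_threshold=0.1):
    """Return the top-ranked entity, or NIL if nothing scores above the threshold."""
    ranked = rank_candidates(context, candidates)
    if not ranked or ranked[0][1] < nil_threshold:
        return NIL
    return ranked[0][0]
```

A supervised ranker would replace the cosine score with a learned function over several features; the thresholding step for unlinkable mentions could stay the same.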



System Configuration:

HARDWARE REQUIREMENTS:

         Hardware          -    Pentium
         Speed             -    1.1 GHz
         RAM               -    1 GB
         Hard Disk         -    20 GB
         Floppy Drive      -    1.44 MB
         Keyboard          -    Standard Windows Keyboard
         Mouse             -    Two or Three Button Mouse
         Monitor           -    SVGA


SOFTWARE REQUIREMENTS:


          Operating System      : Windows
          Technology            : Java and J2EE
          Web Technologies      : HTML, JavaScript, CSS
          IDE                   : MyEclipse
          Web Server            : Tomcat
          Tool Kit              : Android Phone
          Database              : MySQL
          Java Version          : J2SDK1.5
                




