
ENTITY LINKING WITH A KNOWLEDGE BASE: ISSUES, TECHNIQUES, AND SOLUTIONS

Abstract

The large number of potential applications that arise from bridging Web data with knowledge bases has led to an increase in entity linking research. Entity linking is the task of linking entity mentions in text with their corresponding entities in a knowledge base. Potential applications include information extraction, information retrieval, and knowledge base population. However, this task is challenging due to name variations and entity ambiguity. In this survey, we present a thorough overview and analysis of the main approaches to entity linking, and discuss various applications, the evaluation of entity linking systems, and future directions.
Entity linking can facilitate many different tasks such as knowledge base population, question answering, and information integration. As the world evolves, new facts are generated and digitally expressed on the Web. Therefore, enriching existing knowledge bases using new facts becomes increasingly important. However, inserting newly extracted knowledge derived from the information extraction system into an existing knowledge base inevitably needs a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. For example, relation extraction is the process of discovering useful relationships between entities mentioned in text, and an extracted relation requires mapping the entities associated with the relation to the knowledge base before it can be populated into the knowledge base.


EXISTING SYSTEM
Inserting newly extracted knowledge from an information extraction system into an existing knowledge base inevitably needs a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. At the same time, an entity mention could possibly denote different named entities. For instance, the entity mention “Sun” can refer to the star at the center of the Solar System, a multinational computer company, a fictional character named “Sun-Hwa Kwon” on the ABC television series “Lost”, or many other entities which can be referred to as “Sun”. An entity linking system has to disambiguate the entity mention in the textual context and identify the mapping entity for each entity mention.



PROPOSED SYSTEM:
One proposed approach is a probabilistic model which unifies the entity popularity model with the entity object model to link the named entities in Web text with the DBLP bibliographic network. We strongly believe that this direction deserves much deeper exploration by researchers. Finally, it is expected that more research, or even a better understanding of the entity linking problem, may lead to the emergence of more effective and efficient entity linking systems, as well as improvements in the areas of information extraction and the Semantic Web.

MODULE DESCRIPTION:

Number of Modules:

After careful analysis, the system has been identified to have the following modules:
1. Entity linking
2. Knowledge base
3. Candidate Entity Ranking

1. Entity linking
Entity linking can facilitate many different tasks such as knowledge base population, question answering, and information integration. As the world evolves, new facts are generated and digitally expressed on the Web. Therefore, enriching existing knowledge bases using new facts becomes increasingly important. However, inserting newly extracted knowledge derived from the information extraction system into an existing knowledge base inevitably needs a system to map an entity mention associated with the extracted knowledge to the corresponding entity in the knowledge base. For example, relation extraction is the process of discovering useful relationships between entities mentioned in text, and an extracted relation requires mapping the entities associated with the relation to the knowledge base before it can be populated into the knowledge base. Furthermore, a large number of question answering systems rely on their supported knowledge bases to give the answer to the user’s question. To answer the question “What is the birthdate of the famous basketball player Michael Jordan?”, the system should first leverage the entity linking technique to map the queried “Michael Jordan” to the NBA player, instead of, for example, the Berkeley professor; it then retrieves the birthdate of the NBA player named “Michael Jordan” from the knowledge base directly. Additionally, entity linking helps powerful join and union operations that can integrate information about entities across different pages, documents, and sites. The entity linking task is challenging due to name variations and entity ambiguity.
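The question answering example above can be sketched in two steps: link the mention, then read the fact from the linked record. This is only a minimal illustration, assuming Java 8 or later (newer than the J2SDK 1.5 listed under the software requirements). The class BirthdateQa, the two-record knowledge base, the keyword sets, and the keyword-overlap linking rule are all hypothetical and stand in for whatever linking technique a real system would use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy question answering flow: link the mention first, then read the fact from the knowledge base. */
class BirthdateQa {

    /** One knowledge base record. The fields are illustrative, not a real schema. */
    static final class Entity {
        final String id;
        final String name;
        final String birthDate;      // ISO date, or null when this toy record leaves it unset
        final Set<String> keywords;  // lower-case words that describe the entity

        Entity(String id, String name, String birthDate, String... keywords) {
            this.id = id;
            this.name = name;
            this.birthDate = birthDate;
            this.keywords = new HashSet<>(Arrays.asList(keywords));
        }
    }

    /** Hypothetical two-entry knowledge base for the name "Michael Jordan". */
    static final List<Entity> KB = Arrays.asList(
        new Entity("michael_jordan_nba", "Michael Jordan", "1963-02-17", "basketball", "player", "nba"),
        new Entity("michael_jordan_berkeley", "Michael Jordan", null, "berkeley", "professor"));

    /** Step 1: among entities named like the mention, pick the one whose keywords overlap most with the question. */
    static Entity link(String mention, String question) {
        Set<String> words = new HashSet<>(
            Arrays.asList(question.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")));
        Entity best = null;
        int bestOverlap = 0;
        for (Entity e : KB) {
            if (!e.name.equalsIgnoreCase(mention)) {
                continue;
            }
            int overlap = 0;
            for (String k : e.keywords) {
                if (words.contains(k)) {
                    overlap++;
                }
            }
            if (overlap > bestOverlap) {
                best = e;
                bestOverlap = overlap;
            }
        }
        return best; // null when no candidate shares a keyword with the question
    }

    /** Step 2: answer from the linked record, or null when nothing could be linked. */
    static String birthDateOf(String mention, String question) {
        Entity e = link(mention, question);
        return e == null ? null : e.birthDate;
    }

    public static void main(String[] args) {
        String q = "What is the birthdate of the famous basketball player Michael Jordan?";
        System.out.println(birthDateOf("Michael Jordan", q)); // prints 1963-02-17
    }
}
```

The words “basketball” and “player” in the question are what steer the link to the NBA player. Without such context the sketch returns no link rather than guessing.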
 
2. Knowledge base
Given a knowledge base containing a set of entities E and a text collection in which a set of named entity mentions M are identified in advance, the goal of entity linking is to map each textual entity mention m ∈ M to its corresponding entity e ∈ E in the knowledge base. Here, a named entity mention m is a token sequence in text which potentially refers to some named entity and is identified in advance. It is possible that some entity mention in text does not have its corresponding entity record in the given knowledge base. We define this kind of mention as an unlinkable mention and give NIL as a special label denoting “unlinkable”. Therefore, if the matching entity e for entity mention m does not exist in the knowledge base, an entity linking system should label m as NIL. For unlinkable mentions, there are some studies that identify their fine-grained types from the knowledge base, which is out of scope for entity linking systems. Entity linking is also called Named Entity Disambiguation (NED) in the NLP community. In this paper, we focus on entity linking for the English language, rather than cross-lingual entity linking.

Typically, the task of entity linking is preceded by a named entity recognition stage, during which boundaries of named entities in text are identified. While named entity recognition is not the focus of this survey, for the technical details of approaches used in the named entity recognition task, you could refer to the survey paper and some specific methods. In addition, there are many publicly available named entity recognition tools, such as Stanford NER, OpenNLP, and LingPipe. Finkel et al. introduced the approach used in Stanford NER. They leveraged Gibbs sampling to augment an existing Conditional Random Field based system with long-distance dependency models, enforcing label consistency and extraction template consistency.
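The task definition above (each mention m ∈ M is mapped to an entity e ∈ E, or to NIL) can be made concrete with a small sketch, assuming Java 8 or later. The class MentionLookup, its name dictionary, and the entity identifiers are hypothetical. The sketch covers only the lookup of possible entities for a mention and the NIL case where the knowledge base has no record at all; choosing among several candidates is left to the ranking module.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy knowledge base index: surface form of a mention -> entities it may denote. */
class MentionLookup {
    /** Special label for mentions with no corresponding entity in the knowledge base. */
    static final String NIL = "NIL";

    // Lower-cased surface form -> ids of the entities that can be referred to by it.
    private final Map<String, List<String>> nameDictionary = new HashMap<>();

    void addName(String surfaceForm, String entityId) {
        String key = surfaceForm.toLowerCase(Locale.ROOT);
        List<String> ids = nameDictionary.get(key);
        if (ids == null) {
            ids = new ArrayList<>();
            nameDictionary.put(key, ids);
        }
        ids.add(entityId);
    }

    /** Entities the mention may denote; empty when the knowledge base knows no such name. */
    List<String> candidates(String mention) {
        List<String> ids = nameDictionary.get(mention.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.<String>emptyList() : Collections.unmodifiableList(ids);
    }

    /** A mention with no candidate at all has to be labeled NIL. */
    boolean isUnlinkable(String mention) {
        return candidates(mention).isEmpty();
    }

    /** Hypothetical entries for the ambiguous name "Sun". */
    static MentionLookup sunExample() {
        MentionLookup kb = new MentionLookup();
        kb.addName("Sun", "Sun (star)");
        kb.addName("Sun", "Sun Microsystems");
        kb.addName("Sun", "Sun-Hwa Kwon");
        return kb;
    }

    public static void main(String[] args) {
        MentionLookup kb = sunExample();
        System.out.println(kb.candidates("Sun"));
        // "Qwertyuiop" is a made-up mention with no record in this toy knowledge base.
        System.out.println(kb.isUnlinkable("Qwertyuiop") ? NIL : "linkable");
    }
}
```

An empty candidate list is only the simplest NIL case. A mention can also be unlinkable when candidates exist but none of them is the intended entity.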




3. Candidate Entity Ranking

In most cases, the size of the candidate entity set Em is larger than one. Researchers leverage different kinds of evidence to rank the candidate entities in Em and try to find the entity e ∈ Em which is the most likely link for mention m. The main techniques used in this ranking process include supervised ranking methods and unsupervised ranking methods.

To deal with the problem of predicting unlinkable mentions, some work adds a validation step that checks whether the top-ranked entity identified in the Candidate Entity Ranking module is the target entity for mention m. If it is not, NIL is returned for mention m.
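A minimal sketch of ranking followed by NIL validation is shown below, assuming Java 8 or later. It is not one of the surveyed methods: the score is a plain Jaccard overlap between the words around the mention and a short entity description, and the threshold value 0.1 is an arbitrary choice for the demo. The class CandidateRanker and the description strings are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Ranks the candidates of one mention by context overlap and returns NIL for weak matches. */
class CandidateRanker {
    static final String NIL = "NIL";

    /** Lower-cased word set of a text. */
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Jaccard similarity: size of the intersection divided by size of the union. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) inter.size() / union.size();
    }

    /** Candidates (entity id -> description) paired with their scores, best first. */
    static List<Map.Entry<String, Double>> rank(Map<String, String> candidates, String context) {
        Set<String> ctx = tokens(context);
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, String> c : candidates.entrySet()) {
            double s = jaccard(ctx, tokens(c.getValue()));
            scored.add(new AbstractMap.SimpleEntry<String, Double>(c.getKey(), Double.valueOf(s)));
        }
        scored.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        return scored;
    }

    /** Top-ranked entity if its score reaches the threshold, otherwise NIL. */
    static String link(Map<String, String> candidates, String context, double threshold) {
        List<Map.Entry<String, Double>> ranked = rank(candidates, context);
        if (ranked.isEmpty() || ranked.get(0).getValue() < threshold) {
            return NIL;
        }
        return ranked.get(0).getKey();
    }

    /** Hypothetical candidate set Em for the mention "Sun". */
    static Map<String, String> sunCandidates() {
        Map<String, String> c = new LinkedHashMap<>();
        c.put("Sun (star)", "star at the center of the Solar System");
        c.put("Sun Microsystems", "multinational computer company");
        c.put("Sun-Hwa Kwon", "fictional character on the ABC television series Lost");
        return c;
    }

    public static void main(String[] args) {
        Map<String, String> c = sunCandidates();
        System.out.println(link(c, "Sun released a new computer server", 0.1)); // prints Sun Microsystems
        System.out.println(link(c, "My cat is called Sun", 0.1));               // prints NIL
    }
}
```

In the first call only the company description shares a word (“computer”) with the context, giving a score of 1/8 = 0.125, which passes the threshold. In the second call no description overlaps with the context, so the top score is 0 and the mention is labeled NIL.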



System Configuration:

HARDWARE REQUIREMENTS:

         Processor         -    Pentium
         Speed             -    1.1 GHz
         RAM               -    1 GB
         Hard Disk         -    20 GB
         Floppy Drive      -    1.44 MB
         Keyboard          -    Standard Windows Keyboard
         Mouse             -    Two or Three Button Mouse
         Monitor           -    SVGA


SOFTWARE REQUIREMENTS:


          Operating System   : Windows
          Technology         : Java and J2EE
          Web Technologies   : HTML, JavaScript, CSS
          IDE                : MyEclipse
          Web Server         : Tomcat
          Tool Kit           : Android Phone
          Database           : MySQL
          Java Version       : J2SDK 1.5
                




