Document Fraud Detection System
ABSTRACT:
A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data is leaked and found in an unauthorized place (e.g., on the web or on somebody’s laptop). The distributor must assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. We propose data allocation strategies (across the agents) that improve the probability of identifying leakages. These methods do not rely on alterations of the released data (e.g., watermarks). In some cases, we can also inject “realistic but fake” data records to further improve our chances of detecting leakage and identifying the guilty party.
EXISTING SYSTEM:
Traditionally, leakage detection is handled by watermarking: a unique code is embedded in each distributed copy. If that copy is later discovered in the hands of an unauthorized party, the leaker can be identified. Watermarks can be very useful in some cases, but they involve some modification of the original data, and they can sometimes be destroyed if the data recipient is malicious. Yet sharing is often unavoidable: a hospital may give patient records to researchers who will devise new treatments; a company may have partnerships with other companies that require sharing customer data; another enterprise may outsource its data processing, so data must be given to various other companies. We call the owner of the data the distributor and the supposedly trusted third parties the agents.
PROPOSED SYSTEM:
Our goal is to detect when the distributor’s sensitive data has been leaked by agents and, if possible, to identify the agent that leaked the data. Perturbation is a very useful technique in which the data is modified and made “less sensitive” before being handed to agents. Here, in contrast, we develop unobtrusive techniques for detecting leakage of a set of objects or records without altering them.
In this section we develop a model for assessing the “guilt” of agents. We also present algorithms for distributing objects to agents in a way that improves our chances of identifying a leaker. Finally, we consider the option of adding “fake” objects to the distributed set. Such objects do not correspond to real entities but appear realistic to the agents. In a sense, the fake objects act as a type of watermark for the entire set, without modifying any individual members. If it turns out that an agent was given one or more fake objects that were leaked, then the distributor can be more confident that the agent is guilty.

Problem Setup and Notation:
A distributor owns a set T = {t1, …, tm} of valuable data objects. The distributor wants to share some of the objects with a set of agents U1, U2, …, Un, but does not wish the objects to be leaked to other third parties. The objects in T could be of any type and size; e.g., they could be tuples in a relation, or relations in a database. An agent Ui receives a subset of objects Ri ⊆ T, determined either by a sample request or an explicit request:
1. Sample request: Ui asks for some number of objects, and the distributor chooses which objects from T to send.
2. Explicit request: Ui asks for all objects that satisfy a condition it specifies.
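To make the notation concrete, here is a minimal sketch (in Python, for illustration only; the project itself is described below as a PHP application) of the distributor, the agents, and the two request types. All class and method names are our own, not part of the system.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    received: set = field(default_factory=set)   # R_i: objects this agent holds

class Distributor:
    def __init__(self, objects):
        self.T = set(objects)    # T = {t1, ..., tm}, the valuable objects
        self.agents = {}         # name -> Agent (U1, ..., Un)

    def _agent(self, name):
        return self.agents.setdefault(name, Agent(name))

    def explicit_request(self, name, condition):
        """Explicit request: give the agent every object satisfying its condition."""
        agent = self._agent(name)
        agent.received |= {t for t in self.T if condition(t)}
        return agent.received

    def sample_request(self, name, m):
        """Sample request: give the agent m objects chosen by the distributor."""
        agent = self._agent(name)
        agent.received |= set(random.sample(sorted(self.T), m))
        return agent.received
```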
Guilt Model Analysis:
To see how our model parameters interact, and to check whether the interactions match our intuition, in this section we study two simple scenarios: the impact of the probability p (the chance that the target could have obtained a leaked object independently, e.g., by guessing) and the impact of the overlap between Ri and the leaked set S. In each scenario we have a target that has obtained all of the distributor’s objects, i.e., T = S.
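The guilt computation itself is not spelled out above; the sketch below follows the standard formulation from the data-leakage-detection literature, in which each of the agents holding a leaked object is assumed equally likely to have leaked it. Treat the formula as an assumption about the intended model rather than a quotation from it.

```python
def guilt_probability(S, R_i, holders, p):
    """Estimate Pr{G_i | S}: the probability that agent U_i is guilty,
    given that the set S of objects was found leaked.

    S       -- set of leaked objects found at the target
    R_i     -- set of objects that were given to agent U_i
    holders -- dict: object t -> set of agents that received t (V_t)
    p       -- probability the target obtained an object independently
    """
    prob_innocent = 1.0
    for t in S & R_i:
        # With probability (1 - p) the object was leaked by some holder;
        # U_i is that holder with probability 1/|V_t| (all equally likely).
        prob_innocent *= 1.0 - (1.0 - p) / len(holders[t])
    return 1.0 - prob_innocent
```

Under this model, an agent whose objects appear in S with few co-holders accumulates guilt quickly, which matches the intuition behind the two scenarios above.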
Algorithms:
1. Evaluation of Explicit Data Request Algorithms
First, the goal of these experiments was to see whether fake objects in the distributed data sets yield a significant improvement in our chances of detecting a guilty agent. Second, we wanted to evaluate our e-optimal algorithm relative to a random allocation.
2. Evaluation of Sample Data Request Algorithms
With sample data requests, agents are not interested in particular objects, so object sharing is not explicitly defined by their requests. The distributor is “forced” to allocate certain objects to multiple agents only if the total number of requested objects exceeds the number of objects in set T. The more data objects the agents request in total, the more recipients an object has on average; and the more objects are shared among different agents, the more difficult it is to detect a guilty agent. A greedy allocation sketch follows this list.
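As one illustration of the allocation problem for sample requests, the greedy sketch below always hands out the currently least-shared object. This is an assumed simplification in the spirit of the minimum-overlap strategies discussed here, not necessarily the exact algorithm evaluated.

```python
from collections import defaultdict

def allocate_samples(T, requests):
    """Serve sample requests greedily so objects are shared as little as possible.

    T        -- list of all objects (each request must be <= len(T))
    requests -- dict: agent name -> number of objects requested
    Returns dict: agent name -> set of allocated objects (R_i).
    """
    holders = defaultdict(int)                     # object -> current holder count
    allocation = {name: set() for name in requests}
    for name, m in requests.items():
        for _ in range(m):
            # Give this agent the least-shared object it does not yet hold.
            candidates = [t for t in T if t not in allocation[name]]
            pick = min(candidates, key=lambda t: holders[t])
            allocation[name].add(pick)
            holders[pick] += 1
    return allocation
```

When the total number of requested objects is at most |T|, this gives each object to at most one agent; beyond that point sharing is unavoidable, which is exactly the regime in which detecting a guilty agent becomes harder.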
MODULES:
1. Data Allocation Module:
The main focus of our project is the data allocation problem: how can the distributor “intelligently” give data to agents in order to improve the chances of detecting a guilty agent? The admin can send files to authenticated users, and users can edit their account details. An agent views the secret-key details through mail. The allocation is chosen so as to increase the chances of detecting agents that leak data.
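Tying the earlier pieces together, a hypothetical run of this module might look like the following. It reuses allocate_samples and guilt_probability from the sketches above; the agent names and the value of p are made up.

```python
# Hypothetical end-to-end check: allocate, observe a leak, rank agents by guilt.
T = [f"t{k}" for k in range(1, 9)]
allocation = allocate_samples(T, {"U1": 3, "U2": 3, "U3": 3})

# V_t: which agents ended up holding each object.
holders = {t: {a for a, R in allocation.items() if t in R} for t in T}

S = set(allocation["U2"])        # suppose U2's objects surface on the web
for name, R in allocation.items():
    print(name, round(guilt_probability(S, R, holders, p=0.2), 3))
```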
2. Fake Object Module:
The distributor creates and adds fake objects to the data that he distributes to agents. Fake objects are objects generated by the distributor in order to increase the chances of detecting agents that leak data; by adding them to the distributed data, the distributor improves his effectiveness in detecting guilty agents. Our use of fake objects is inspired by the use of “trace” records in mailing lists. If a wrong secret key is given when downloading a file, a duplicate (fake) file is opened instead, and the fake-object details are also sent by mail; for example, the fake object details will be displayed.
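A minimal sketch of the wrong-key behavior described above: a correct secret key returns the real file, while a wrong key serves a realistic but fake duplicate and mails the distributor. The record fields and the notify callback are hypothetical.

```python
import secrets

def make_fake_record(real_record):
    """Build a realistic-looking fake duplicate (a 'trace' record)."""
    fake = dict(real_record)
    fake["account_no"] = secrets.token_hex(4)   # plausible but fabricated value
    fake["_trace"] = True                       # marker only the distributor knows
    return fake

def download(files, keys, filename, presented_key, notify):
    """Return the real file on a correct key; otherwise return the fake
    duplicate and alert the distributor that a wrong key was used."""
    if secrets.compare_digest(presented_key, keys[filename]):
        return files[filename]
    notify(f"Wrong key presented for {filename}; fake duplicate served.")
    return make_fake_record(files[filename])
```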
3. Optimization Module:
In the Optimization Module, the distributor’s data allocation to agents has one constraint and one objective. The distributor’s constraint is to satisfy the agents’ requests, by providing them with the number of objects they request or with all available objects that satisfy their conditions. His objective is to be able to detect an agent who leaks any portion of his data. Users can also lock and unlock files for security.
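One plausible way to make the stated objective measurable (an assumption, not necessarily the project’s exact metric) is to score an allocation by how much of each agent’s data is also held by others, and then prefer allocations with a lower score:

```python
def overlap_score(allocation):
    """Sum over ordered agent pairs (i, j), i != j, of |R_i & R_j| / |R_i|.

    Lower is better: the less of an agent's data is shared with others,
    the more confidently a leak can be pinned on a single agent.
    """
    score = 0.0
    for i, R_i in allocation.items():
        if not R_i:
            continue
        for j, R_j in allocation.items():
            if i != j:
                score += len(R_i & R_j) / len(R_i)
    return score
```

Comparing this score for a random allocation against the greedy allocation sketched earlier illustrates why “intelligent” allocation improves detection.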
4. Data Distributor:
A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data is leaked and found in an unauthorized place (e.g., on the web or on somebody’s laptop). The distributor must assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. The admin can view which file is leaking, as well as the fake users’ details.
Hardware Required:
- System: Pentium IV, 2.4 GHz
- Hard Disk: 40 GB
- Floppy Drive: 1.44 MB
- Monitor: 15" VGA colour
- Mouse: Logitech
- Keyboard: 110-key enhanced
- RAM: 256 MB
S/W System Configuration:
- Operating System: Windows 95/98/2000/NT 4.0
- Application Server: WAMP 2.2e
- Front End: HTML, PHP
- Scripts: JavaScript
- Server-side Script: PHP
- Database: MySQL
- Database Connectivity: phpMyAdmin