k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data


Abstract:

Data mining has wide applications in many areas such as banking, medicine, scientific research, and government agencies. Classification is one of the most commonly used tasks in data mining applications. Over the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks, to the cloud. Since the data on the cloud is in encrypted form, existing privacy-preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud.

Existing System:
Ensuring the security of data is therefore critical, not only to preserve employees' highly personal information but also to minimize the legal risk to the organization as a whole. Some organizations should not be able to view the full details of a job seeker's CV, so the CV must be protected. When an organization seeks to reduce its manual workload, it replaces those manual processes with systems offering various levels of security.
Disadvantages:
Data leakage, data theft by a third party, and inefficient clustering.

Proposed System:
We propose a secure k-NN classifier over encrypted data in the cloud. The proposed k-NN protocol protects the confidentiality of the data, the user's input query, and data access patterns. To the best of our knowledge, this is the first work to develop a secure k-NN classifier over encrypted data under the standard semi-honest model. We also empirically analyze the efficiency of our solution through various experiments.
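For reference, the underlying computation the secure protocol protects is ordinary k-NN classification: find the k training records closest to the query and return the majority class label among them. A toy plaintext sketch (the dataset and the choice of k are illustrative, not taken from the paper):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training records, using squared Euclidean distance."""
    dist = lambda x: sum((a - b) ** 2 for a, b in zip(x, query))
    neighbors = sorted(train, key=lambda rec: dist(rec[0]))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

# toy dataset: (feature vector, class label)
train = [((1, 1), "A"), ((1, 2), "A"), ((8, 8), "B"), ((9, 8), "B")]
print(knn_classify(train, (2, 1), k=3))  # -> "A"
```

In the secure setting, the same distance comparisons and majority vote must be carried out over ciphertexts, which is what makes the protocol design non-trivial.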

Algorithms:
               1. k-means clustering algorithm
               2. ElGamal algorithm
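The report does not give ElGamal parameters; the sketch below shows textbook ElGamal over a toy prime group in Python, purely to illustrate the key-generation/encrypt/decrypt mechanics and why fresh randomness per message gives semantic security. Real deployments use much larger, standardized parameters.

```python
import random

# Toy ElGamal over a small prime group (illustration only;
# real deployments use 2048-bit+ parameters).
p = 467          # small prime modulus (toy value)
g = 2            # group element used as the base

def keygen():
    x = random.randrange(2, p - 1)       # private key
    return x, pow(g, x, p)               # (private x, public h = g^x mod p)

def encrypt(h, m):
    r = random.randrange(2, p - 1)       # fresh randomness per message
    return pow(g, r, p), (m * pow(h, r, p)) % p   # (c1, c2)

def decrypt(x, ct):
    c1, c2 = ct
    return (c2 * pow(c1, p - 1 - x, p)) % p       # c2 / c1^x mod p

x, h = keygen()
ct = encrypt(h, 123)
print(decrypt(x, ct))   # -> 123
```

Because a fresh r is drawn for every encryption, two encryptions of the same message look unrelated, which is the semantic-security property the paper's title refers to.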

Implementation Modules:
1.     Privacy-Preserving Data Mining (PPDM)
2.     Query processing over encrypted data
3.     Security analysis of privacy-preserving primitives under the semi-honest model
4.     Security proof


Privacy-Preserving Data Mining (PPDM):
Privacy-Preserving Data Mining (PPDM) is defined as the process of extracting or deriving knowledge about data without compromising the privacy of that data. Over the past decade, many privacy-preserving classification techniques have been proposed in the literature to protect user privacy. In privacy-preserving classification in particular, the goal is to build a classifier that predicts the class label of an input data record from a distributed training dataset, without compromising the privacy of the data.

Query processing over encrypted data:
First, the intermediate k nearest neighbors found during classification should not be disclosed to the cloud or to any user. We emphasize that the recent method in [54] reveals the k nearest neighbors to the user. Second, even if the k nearest neighbors are known, it is still very difficult to find the majority class label among them, since they are encrypted in the first place to prevent the cloud from learning sensitive information. Third, existing work does not address the access-pattern issue, which is a crucial privacy requirement from the user's perspective.

Security Analysis of Privacy-Preserving Primitives under the Semi-Honest Model:
Here we provide a formal security proof for the proposed PPkNN protocol under the semi-honest model. First of all, we stress that, due to the encryption of q and the semantic security of the Paillier cryptosystem, Bob's input query q is protected from Alice, C1, and C2 in our PPkNN protocol. Apart from guaranteeing query privacy, recall that the goal of PPkNN is to protect data confidentiality and hide data access patterns. To prove a protocol's security under the semi-honest model, we adopt the well-known security definitions from the literature on secure multiparty computation (SMC). More specifically, we adopt security proofs based on the standard simulation paradigm. For presentation purposes, we provide formal security proofs (under the semi-honest model) for Stages 1 and 2 of PPkNN separately. Note that the outputs returned by each sub-protocol are in encrypted form and known only to C1.
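Since the proof relies on the semantic security and additive homomorphism of the Paillier cryptosystem, a minimal sketch of Paillier may help. The primes below are toy values chosen for illustration; real parameters are 1024-bit or larger, and this is not the authors' implementation.

```python
import math, random

# Minimal Paillier sketch (toy primes; real use needs 1024-bit+ primes).
p, q = 293, 433
n, n2 = p * q, (p * q) ** 2
g = n + 1                                    # standard choice g = n + 1
lam = math.lcm(p - 1, q - 1)                 # lambda = lcm(p-1, q-1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)  # mu = L(g^lam mod n^2)^-1 mod n

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:               # r must be a unit mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n   # L(c^lam mod n^2) * mu

# Additive homomorphism: E(a) * E(b) mod n^2 decrypts to a + b.
a, b = 15, 27
print(decrypt(encrypt(a) * encrypt(b) % n2))  # -> 42
```

The homomorphic property shown on the last line is what lets the cloud servers combine encrypted values without ever decrypting them, while semantic security (fresh randomness r per ciphertext) ensures repeated encryptions of the same value are indistinguishable.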

Security proof:
Data security means keeping data protected from corruption and unauthorized access; its focus is to ensure privacy while protecting personal or corporate data. This paper manages the complexity of CV data, which covers the processes through which employees develop their personal and organizational skills, knowledge, and abilities. Security is a major consideration when choosing a human-resources management system, especially when it means keeping corporate data and the privacy of employee records safe from attackers. It is essential for companies to choose a solution that uses a secure transmission method such as SSL, which encrypts the data as it travels over the job portal. An important aspect of security is hiding a user's particular details, thereby preventing theft of a job seeker's or employee's details.

System Specifications:

Hardware Requirements:
  • System                           -    Pentium IV 2.4 GHz
  • Speed                            -    1.1 GHz
  • RAM                              -    256 MB (min)
  • Hard Disk                        -    40 GB
  • Keyboard                         -    Standard Windows keyboard
  • Mouse                            -    Logitech
  • Monitor                          -    15" VGA color

Software Requirements:

  • Operating System           :   Windows XP/7
  • Application Server         :   Tomcat 5.0/6.0
  • Front End                  :   HTML, Java, JSP
  • Scripts                    :   JavaScript
  • Server-side Script         :   Java Server Pages
  • Database                   :   MongoDB
  • Database Connectivity      :   Robomongo 0.8.5-i386





