17-10-2016, 02:32 PM
Abstract
Credit cards are the most widely accepted payment mode for both online and offline purchases in today’s world, providing cashless shopping at shops in all countries. They are among the most convenient ways to shop online, pay bills, etc. As a result, the risk of fraudulent credit card transactions has also been increasing. In existing credit card fraud detection systems, a fraudulent transaction is detected only after the transaction is done. Such fraud is difficult to trace, and the resulting losses are borne by the issuing authorities. The Hidden Markov Model is a statistical tool that engineers and scientists use to solve various problems. In this paper, it is shown that credit card fraud can be detected during transactions using a Hidden Markov Model. The Hidden Markov Model helps to obtain high fraud coverage combined with a low false alarm rate.
INTRODUCTION
In day-to-day life, credit cards are used for purchasing goods and services, either with a virtual card for online transactions or with a physical card for offline transactions. In a physical-card based purchase, the cardholder presents his card physically to a merchant for making a payment. To carry out fraudulent transactions in this kind of purchase, an attacker has to steal the credit card. If the cardholder does not realize the loss of the card, it can lead to a substantial financial loss to the credit card company. In online payment mode, attackers need only a little information to carry out a fraudulent transaction (secure code, card number, expiration date, etc.).
In this purchase method, transactions are mainly done through the Internet or telephone. To commit fraud in these types of purchases, a fraudster simply needs to know the card details.
Most of the time, the genuine cardholder is not aware that someone else has seen or stolen his card information. The only way to detect this kind of fraud is to analyze the spending patterns on every card and to figure out any inconsistency with respect to the “usual” spending patterns. Fraud detection based on the analysis of the existing purchase data of a cardholder is a promising way to reduce the rate of successful credit card frauds. Since humans tend to exhibit specific behaviouristic profiles, every cardholder can be represented by a set of patterns containing information about the typical purchase category, the time since the last purchase, the amount of money spent, etc. Deviation from such patterns is a potential threat to the system.
LITERATURE REVIEW
Credit card fraud detection has drawn a lot of research interest, and a number of techniques, with special emphasis on neural networks, data mining and distributed data mining, have been suggested. Ghosh and Reilly have proposed credit card fraud detection with a neural network. They have built a detection system, which is trained on a large sample of labelled credit card account transactions. These transactions contain example fraud cases due to lost cards, stolen cards, application fraud, counterfeit fraud, mail-order fraud, and nonreceived issue (NRI) fraud. Recently, Syeda et al. have used parallel granular neural networks (PGNNs) for improving the speed of the data mining and knowledge discovery process in credit card fraud detection. A complete system has been implemented for this purpose. Stolfo et al. [3] suggest a credit card fraud detection system (FDS) using metalearning techniques to learn models of fraudulent credit card transactions. Metalearning is a general strategy that provides a means for combining and integrating a number of separately built classifiers or models.
A metaclassifier is thus trained on the correlation of the predictions of the base classifiers. The same group has also worked on a cost-based model for fraud and intrusion detection. They use Java Agents for Metalearning (JAM), which is a distributed data mining system for credit card fraud detection. A number of important performance metrics, like True Positive-False Positive (TP-FP) spread and accuracy, have been defined by them. Aleskerov et al. present CARDWATCH, a database mining system used for credit card fraud detection. The system, based on a neural learning module, provides an interface to a variety of commercial databases. Kim and Kim have identified the skewed distribution of data and the mix of legitimate and fraudulent transactions as the two main reasons for the complexity of credit card fraud detection. Based on this observation, they use the fraud density of real transaction data as a confidence value and generate a weighted fraud score to reduce the number of misdetections.
Fan et al. suggest the application of distributed data mining in credit card fraud detection. Brause et al. have developed an approach that involves advanced data mining techniques and neural network algorithms to obtain high fraud coverage. Chiu and Tsai have proposed Web services and data mining techniques to establish a collaborative scheme for fraud detection in the banking industry. With this scheme, participating banks share knowledge about the fraud patterns in a heterogeneous and distributed environment. To establish a smooth channel of data exchange, Web services techniques such as XML, SOAP, and WSDL are used. Phua et al. have done an extensive survey of existing data-mining based FDSs and published a comprehensive report. Prodromidis and Stolfo use an agent-based approach with distributed learning for detecting frauds in credit card transactions. It is based on artificial intelligence and combines inductive learning algorithms and metalearning methods for achieving higher accuracy.
Phua et al. suggest the use of a similar metaclassifier in fraud detection problems. They consider naïve Bayesian and Back Propagation neural networks as the base classifiers. A metaclassifier is used to determine which classifier should be considered based on the skewness of the data. Although they do not directly use credit card fraud detection as the target application, their approach is quite generic. Vatsa et al. have recently proposed a game-theoretic approach to credit card fraud detection. They model the interaction between an attacker and an FDS as a multistage game between two players, each trying to maximize his payoff. HMM-based applications are common in various areas such as speech recognition, bioinformatics, and genomics. In recent years, Joshi and Phoha have investigated the capabilities of HMM in anomaly detection.
They classify TCP network traffic as an attack or normal using HMM. Cho and Park suggest an HMM-based intrusion detection system that improves the modeling time and performance by considering only the privilege transition flows based on the domain knowledge of attacks. Ourston et al. have proposed the application of HMM in detecting multistage network attacks. Hoang et al. present a new method to process sequences of system calls for anomaly detection using HMM. The key idea is to build a multilayer model of program behaviours based on both HMMs and enumerating methods for anomaly detection. Lane has used HMM to model human behaviour. Once human behaviour is correctly modeled, any detected deviation is a cause for concern, since an attacker is not expected to have behaviour similar to the genuine user. Hence, an alarm is raised in case of any deviation.
Objective of the Project:
The objective of this application is as follows:
Creating an application to detect credit card fraud.
Implementing the Hidden Markov Model.
Creating a database containing all relevant customer information.
Providing security to customers at the time of transaction.
Implementing a firewall to restrict access from outside the network.
Features and Functionality
This project is based upon Hidden Markov Model.
HMMs can be applied to the gesture-recognition problem.
It is primarily concerned with empirically picking the HMM parameters from some sample data.
Detects fraud by analyzing the spending patterns on every card and identifying any inconsistency with respect to the “usual” spending patterns.
The fraud detection system is based on the analysis of the existing purchase data of the cardholder.
It is a promising way to reduce the rate of successful credit card frauds.
Advantages of the Project:
Proper security provisions are made against malicious threats and hacking tools so that a user account cannot be harmed, intentionally or unintentionally, by fraud.
Proper hierarchy of the users is maintained as per authority to access the data and use the services provided by the authority.
Track all the necessary details during transaction process.
In the existing system, even the original cardholder is checked for fraud detection. In this system there is no need to check the original user, as we maintain a log.
The log that is maintained also serves as proof for the bank of the transactions made.
This reduces the tedious work of an employee in the bank.
This technique aims to make fraud detection more accurate.
Project Applications:
The different areas where we can use this application are:
It can be used on the server of any organization providing financial transactions.
It can also be used in banks and modifications can be easily done according to requirements.
Project Risks:
Proper network security is necessary.
Frauds can be done using SQL injection.
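The SQL injection risk above is usually reduced by passing user input as bound parameters instead of concatenating it into the query text. The following is a minimal sketch in Python with the standard `sqlite3` module; the `cards` table, its columns and the `find_card` helper are hypothetical names chosen for illustration and are not part of the project.

```python
import sqlite3


def find_card(conn, card_number):
    """Look up a card holder using a bound parameter (hypothetical schema)."""
    # The "?" placeholder keeps user input as data; it is never parsed as SQL.
    cur = conn.execute(
        "SELECT holder FROM cards WHERE card_number = ?", (card_number,)
    )
    return cur.fetchone()


# In-memory demo database with one illustrative row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (card_number TEXT PRIMARY KEY, holder TEXT)")
conn.execute("INSERT INTO cards VALUES (?, ?)", ("4111111111111111", "A. Holder"))

genuine = find_card(conn, "4111111111111111")
# A classic injection string is treated as a literal card number and matches nothing.
injected = find_card(conn, "' OR '1'='1")
```

With string concatenation, the second call could have turned the WHERE clause into a condition that is always true; with a bound parameter it simply finds no row.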
Future Enhancements:
Speed of the software can be enhanced by implementation of algorithms of less complexity.
Inter mail server can be implemented using the same concept.
Project Limitations:
It works only on Windows XP/2000/Vista/7.
A proper document hierarchy should be maintained for accessing the required credit card.
Files and folders of a user that are deleted accidentally cannot be recovered.
The Service Provider’s Switching System
Before delving into the provider’s fraud detection service, it is important to understand its transaction switch that provides the backbone for this service.
As shown in Figure 1, the service provider maintains a four-node NonStop active/active system to supply transaction-switching services. Each node is a multi-processor NonStop Integrity Server. Two nodes of the active/active system are located in the provider’s northeastern U.S. data center, and the two other nodes are located in the provider’s southeastern U.S. data center. The geographical separation of the two data centers guarantees survivability in the event of either a localized or regional disaster that takes down one of the centers.
Existing System
Issuing banks typically provide their own fraud detection. A typical fraud detection process proceeds as follows.
The important information associated with a transaction is written to a transaction log by the bank’s authorization system. This information includes the card identification, the transaction amount, the time of the transaction, and the location of the transaction. The log is sent to a separate fraud detection system every few hours. The fraud detection system is optimized to perform complex analyses on the transaction log to look for fraudulent activity against any particular card or account. In this way, a series of transactions made over time against a card or an account can be analyzed.
The fraud detection system flags a suspicious transaction with a severity flag and writes this information to another log. Periodically, the log is sent to the bank’s authorization system, which takes appropriate action on the card. A credit hold might be placed on the card so that all further transactions will be rejected until the issue is researched or until the problem is resolved. Alternatively, upon the next attempted transaction, the merchant might be informed to ask the customer to call the bank in order to authorize the transaction.
This method is still generally the primary fraud detection procedure in use today. The problem with this method is that it typically takes hours or even days to flag a card that is perhaps being used fraudulently. During this time, the bank can experience significant losses as additional fraudulent transactions are made. In general, the bank is responsible for such transactions.
Fraud Detection in Real-Time:
The transaction-switching service provider realized that there was an opportunity to provide a unique and important service to the issuing banks. If it could detect suspicious or fraudulent activity in real-time, it could stop fraudulent transactions at the retail counter or at the ATM much sooner, or in some cases, even before they were authorized. This service would be a value-added service that would distinguish it from other ATM/POS switching networks.
Providing systems fast and powerful enough to perform this service would be quite expensive and complex – it would require integrating the disparate applications of transaction authorization and fraud detection in new and complex ways – a hindrance to any issuing bank that might want to build its own real-time fraud detection system. However, by building such a system that could be shared by many issuing banks, the cost could be justified.
To implement this system, the switching provider installed multiple high-performance servers that could quickly analyze transactions on-the-fly to determine if they were suspicious. The selected servers were large Sun Solaris servers running Oracle databases. Each server comprised eight quad-core CPUs.
Each data center is provided with its own fraud detection complex, comprising multiple Sun Solaris/Oracle servers (the cards/accounts are assigned to particular Sun Solaris/Oracle servers at a site in order to partition the work load). The fraud detection complex is easily scalable to handle additional load by adding additional servers and reassigning the accounts/cards accordingly.
In this new approach, when a transaction is received by a switch node, it is not only sent to the issuing bank for authorization but also replicated in real-time to a fraud detection server via a Shadowbase replication engine. The Shadowbase engine routes the transaction to the particular fraud detection server that is monitoring that card or account. Transaction distribution by card number or account is accomplished via routing rules configured into the Shadowbase replication engine.
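The idea of sending every transaction for a given card to the same fraud detection server can be sketched as follows. This is only an illustration of partitioning by card number: the hash-based rule, the server names and the `route` function are assumptions and do not represent how Shadowbase routing rules are actually configured.

```python
import hashlib

# Hypothetical fraud detection servers at one data center.
FRAUD_SERVERS = ["fds-1", "fds-2", "fds-3"]


def route(card_number, servers=FRAUD_SERVERS):
    """Pick the server that monitors this card (assumed hash partitioning)."""
    # A stable digest ensures the same card always maps to the same server.
    digest = hashlib.sha256(card_number.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = route("4111111111111111")
second = route("4111111111111111")
```

A simple modulo rule like this moves many cards when a server is added; an explicit card-to-server assignment table, as the text suggests with "reassigning the accounts/cards", would give finer control.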
The powerful fraud detection system rapidly analyzes the transaction on-the-fly and, if suspicious, notifies the switch node via reverse replication using the Shadowbase replication engine. This notification contains a severity flag indicating the degree of suspicion. The goal is to be fast enough to beat the issuing bank’s response so that the transaction can be flagged before it is returned to the servicing bank or to the merchant’s POS device. In any case, the issuing bank is notified of the fraud finding. If the flagged message is received by the switch node after the authorization response has been returned to the merchant or to the issuing bank, at least the issuing bank (and the provider’s message switch) has been notified and can take appropriate action on the next transaction, far sooner than it would otherwise be notified under the old fraud detection approach.
Of course, transactions for a given card may come into both data centers as the card is used at different stores or ATMs. Therefore, in order for fully effective fraud detection to work, both data centers must know all of the transactions for all cards. This is accomplished by the switch nodes, which block transactions as they are received and send them in near-real-time to the opposite data center, where they are recorded in the fraud detection system at that data center. In this way, both fraud detection complexes know about all transactions for each card and can effectively monitor each card transaction for fraudulent use no matter which switching node receives the transaction. Additionally, this provides a full disaster tolerant back up of the fraud detection services at each node, should one of the data centers be lost in a disaster.
The action taken by the switch node for a suspicious transaction can be configured to correspond to the desires of the issuing bank and the merchant. Different levels of actions can be specified for different levels of suspicion severity. In some cases, it may be desirable to reject the transaction. In others, it may be desirable to request that the customer call the bank before this transaction or additional transactions can be authorized. In still other situations, the issuing bank may want to allow the transaction but leave a voice or e-mail message for the customer notifying him of a potentially suspicious transaction.
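The configurable mapping from suspicion severity to action could look like the sketch below. The numeric severity scale, the thresholds and the names `Action`, `POLICY` and `action_for` are all assumptions made for illustration; the source does not specify how severity levels are encoded.

```python
from enum import Enum


class Action(Enum):
    ALLOW_AND_NOTIFY = "allow, leave voice or e-mail message"
    REQUIRE_CALL = "customer must call the bank"
    REJECT = "reject the transaction"


# Hypothetical per-bank policy: (minimum severity, action), highest first.
POLICY = [
    (8, Action.REJECT),
    (5, Action.REQUIRE_CALL),
    (1, Action.ALLOW_AND_NOTIFY),
]


def action_for(severity, policy=POLICY):
    """Return the configured action for a severity flag, or None if not suspicious."""
    for threshold, action in policy:
        if severity >= threshold:
            return action
    return None
```

Each issuing bank could supply its own `policy` list, which keeps the switch logic unchanged while the responses differ per bank.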
Data Mining for Credit/Debit card Fraud Detection:
Data mining is popularly used to combat fraud because of its effectiveness. It is a well-defined procedure that takes data as input and produces models or patterns as output. A neural network, a data mining technique, was used in this study. The design of the neural network (NN) architecture for the credit card detection system was based on an unsupervised method, which was applied to the transaction data to generate four clusters: low, high, risky and high-risk. The self-organizing map neural network (SOMNN) technique was used to classify each transaction into its associated group, since the output is not known a priori. According to the receiver operating characteristic (ROC) curve, the credit card fraud (CCF) detection watch detected over 95% of fraud cases without causing false alarms, unlike other statistical models and the two-stage clusters. This shows that the performance of the CCF detection watch is consistent with that of other detection software, and in this study it performed better.
Hidden Markov Model:
A Hidden Markov Model is a finite set of states; each state is associated with a probability distribution. Transitions among these states are governed by a set of probabilities called transition probabilities. In a particular state, an outcome or observation can be generated according to the observation probability distribution associated with that state. Only the outcome, not the state, is visible to an external observer; the states are therefore “hidden” from the outside, hence the name Hidden Markov Model. The Hidden Markov Model is therefore well suited to detecting fraudulent credit card transactions. One more important benefit of the HMM-based approach is a sharp decrease in the number of false positives, that is, transactions identified as malicious by a fraud detection system even though they are really genuine. In this prediction process, the HMM mainly considers three price value ranges:
1) Low (l),
2) Medium (m) and,
3) High (h).
First, it is necessary to determine which category a transaction amount belongs to: the low, medium or high range.
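Mapping each transaction amount to one of the three observation symbols can be sketched as below. The boundary values `LOW_MAX` and `MEDIUM_MAX` are placeholders chosen for illustration; the source does not give numeric ranges, and in practice they would be derived per cardholder.

```python
# Assumed range boundaries (not given in the source).
LOW_MAX = 100.0
MEDIUM_MAX = 500.0


def to_symbol(amount, low_max=LOW_MAX, medium_max=MEDIUM_MAX):
    """Map a transaction amount to the observation symbol l, m or h."""
    if amount < 0:
        raise ValueError("transaction amount cannot be negative")
    if amount <= low_max:
        return "l"
    if amount <= medium_max:
        return "m"
    return "h"


def to_sequence(amounts):
    """Convert a list of amounts into the symbol sequence observed by the HMM."""
    return [to_symbol(a) for a in amounts]
```

For example, with these assumed boundaries, `to_sequence([20, 250, 900])` yields one low, one medium and one high symbol.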
Credit Card Fraud Detection
This section presents a credit card fraud detection system based on the Hidden Markov Model, which does not require fraud signatures and is still able to detect fraud just by considering a cardholder’s spending habits. The particulars of the items purchased in a single transaction are generally unknown to any credit card fraud detection system, whether it runs at the bank that issues credit cards to the cardholders or at the merchant site where the goods are purchased. The fraud detection system runs at the credit card issuing bank or at the merchant site. Each incoming transaction is submitted to the fraud detection system for verification. The fraud detection system accepts the card details, such as credit card number, CVV number, card type, expiry date and the purchase amount, to validate whether the transaction is genuine or not.
To implement the Hidden Markov Model for detecting fraudulent credit card transactions, the system creates clusters from the training set and identifies the spending profile of the cardholder.
The number and types of items bought in a particular transaction are not known to the fraud detection system; it concentrates only on the purchase amount and uses it for further processing.
It stores the transaction amounts in the form of clusters, depending on whether the amount falls in the low, medium or high value range. It tries to find any variance in the transaction based on the spending behavioral profile of the cardholder, the shipping address, the billing address and so on.
The initial set of probabilities is chosen based on the spending behavioral profile of the cardholder, and a sequence is constructed for further processing. If the fraud detection system determines that the transaction is fraudulent, it raises an alarm, and the issuing bank declines the transaction. For security purposes, the security information module collects the security information and stores it in the database. If the card is lost, the security information form appears to accept the security information.
The security form has a number of security questions, such as account number, date of birth, mother’s name and other personal questions and their answers, which the user has to answer correctly to move to the transaction section. All this information must be known only to the cardholder. It has informational privacy and informational self-determination, which are addressed evenly by affording people and entities a trusted means to use, secure, search, process and exchange personal and/or confidential information.
The system and tools for pre-authorizing transactions provide a communication link between a retailer and a credit card owner. The cardholder initiates credit card transaction processing by communicating the credit card number, card type and expiry date, which are stored in the database together with a distinctive piece of information that characterizes a particular transaction to be made by an authorized user of the credit card at a later time.
The details are received as network data in the database only if an accurate individual recognition code is used with the communication. The cardholder or other authorized user can then make only that particular transaction with the credit card. Since the transaction is pre-authorized, the vendor does not need to see or transmit the individual recognition code.
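Grouping past amounts into three clusters and reading off the cardholder's spending profile can be sketched with a small one-dimensional k-means. The source does not name a clustering algorithm, so k-means, the initialisation rule and the function names here are assumptions.

```python
def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


def kmeans_1d(amounts, k=3, iterations=20):
    """Cluster amounts into k groups; returns centroids in ascending order."""
    data = sorted(amounts)
    # Spread the initial centroids across the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            clusters[nearest(x, centroids)].append(x)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return sorted(centroids)


def spending_profile(amounts, centroids):
    """Return 'l', 'm' or 'h' for the cluster holding most transactions."""
    counts = [0] * len(centroids)
    for x in amounts:
        counts[nearest(x, centroids)] += 1
    return "lmh"[counts.index(max(counts))]


past_amounts = [10, 12, 15, 11, 14, 200, 220, 900, 13, 16]
centroids = kmeans_1d(past_amounts)
profile = spending_profile(past_amounts, centroids)
```

In this example most purchases fall into the lowest cluster, so the cardholder would be treated as a low spender when the initial probabilities are chosen.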
Neural networks:
A neural network is defined as a set of interconnected nodes designed to represent the functioning of the human brain. Each node has a weighted connection to several other linked nodes in adjacent layers. A single node takes the input received from linked nodes and uses the weights of the connections together with a simple function to compute output values. Neural networks can be created for supervised and/or unsupervised learning. The user specifies the number of hidden layers along with the number of nodes within each hidden layer. The output layer of the neural network may contain one or several nodes depending upon the application. Recently, neural network researchers have incorporated several methods from statistics and numerical analysis into their networks. From the given cases, neural networks learn nonlinear mapping relations from the input space to the output space. Neural networks can learn and summarize the internal regularities of data even without advance knowledge of the underlying principles. According to Rumelhart (1986), neural network topologies, or architectures, are formed by organizing nodes into layers and connecting layers of neurons with modifiable weighted interconnections. A network can also adapt its behavior to a new environment, evolving from the present environment to a new situation. Because of these advantages, neural networks are common in credit card fraud detection, while statistical methods are sometimes less usual in practical research. On the other hand, neural networks still have many disadvantages, such as
(1) Difficulty to confirm the structure,
(2) Excessive training,
(3) Efficiency of training and so on.
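How a single node combines its inputs, weights and a simple function can be sketched as follows. The sigmoid activation and the bias term are assumptions, since the text only speaks of a "simple function"; the weights are arbitrary illustrative values, not trained ones.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through the activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def layer_output(inputs, weight_rows, biases):
    """Outputs of every node in one layer; each row holds one node's weights."""
    return [node_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Two inputs feeding a hidden layer of three nodes (illustrative weights).
hidden = layer_output(
    [0.5, 0.2],
    [[0.4, -0.6], [0.1, 0.9], [-0.3, 0.2]],
    [0.0, 0.1, -0.1],
)
```

Stacking such layers, and adjusting the weights during training, gives the nonlinear input-to-output mapping described above.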
Genetic algorithms:
For predictive purposes, algorithms are often acclaimed as a means of detecting fraud. To establish logic rules capable of classifying credit card transactions into suspicious and non-suspicious classes, Bentley et al. (2000) suggested an algorithm based on genetic programming. This method follows a scoring process. In the experiment described in their study, the database consisted of 4,000 transactions with 62 fields. As for the similarity tree, training and testing samples were employed. Different types of rules were tested with the different fields, and the best rule is the one with the highest predictability. Their method has proven results for real home insurance data and could be an effective method against credit card fraud. Chan et al. (1999) developed an algorithm for predicting suspect behavior. The distinguishing feature of their research is that the model is evaluated and rated by a cost model, whereas other studies use evaluation based on the prediction rate, or True Positive Rate (TPR), and the error rate, or False Negative Rate (FNR). Wheeler & Aitken (2000) developed the idea of combining different algorithms to maximize the power of prediction. Their article presents different algorithms: diagnostic algorithms, diagnostic resolution strategies, best match algorithms, density selection algorithms, probabilistic curve algorithms and negative selection algorithms.
Clustering techniques:
Two clustering techniques have been suggested for behavioral fraud by Bolton & Hand (2002). Peer group analysis is a system that identifies accounts that are behaving differently from others at one moment in time, whereas previously they were behaving the same. These accounts are then flagged as suspicious, and fraud analysts investigate those cases. The hypothesis behind peer group analysis is that if accounts have behaved the same for a certain period of time and one account then starts behaving significantly differently, that account has to be flagged.
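The peer group idea can be sketched in a few lines: find the accounts whose past behavior was closest to the target account, then check how far the target now sits from that group. The Euclidean distance, the group size `k`, the threshold and the sample data are assumptions for illustration and are not taken from Bolton & Hand.

```python
from statistics import mean, pstdev


def peer_group(target, history, k=3):
    """The k accounts whose past spending was closest to the target's."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    others = [acc for acc in history if acc != target]
    others.sort(key=lambda acc: distance(history[target], history[acc]))
    return others[:k]


def is_suspicious(target, history, current, k=3, threshold=3.0):
    """Flag the target if it now deviates strongly from its former peers."""
    values = [current[p] for p in peer_group(target, history, k)]
    centre, spread = mean(values), pstdev(values)
    if spread == 0:
        return current[target] != centre
    return abs(current[target] - centre) / spread > threshold


# Past weekly spending per account, then the current week (made-up numbers).
history = {
    "A": [100, 110, 105], "B": [102, 108, 107], "C": [98, 112, 103],
    "D": [101, 109, 106], "E": [900, 950, 920],
}
current = {"A": 800, "B": 104, "C": 110, "D": 99, "E": 940}
```

Here account A used to behave like B, C and D but now spends far more than they do, so it is the one that would be passed to a fraud analyst.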
Outlier Detection:
Outliers are a basic form of non-standard observation that can be used for fraud detection. An outlier is an observation that deviates so much from other observations that it arouses suspicion that it was generated by a different mechanism. This model employs an unsupervised learning approach. Generally, the result of unsupervised learning is a new explanation or representation of the observed data, which then leads to improved future decisions. Unsupervised methods do not need prior knowledge of fraudulent and non-fraudulent transactions in a historical database; instead, they detect changes in behavior and/or unusual transactions. These methods model a baseline distribution that represents normal behavior and then detect observations that deviate from this norm. In supervised methods, on the other hand, models are trained to discriminate between fraudulent and non-fraudulent transactions so that new observations can be assigned to classes. Supervised methods require accurate identification of fraudulent transactions in historical databases, and can only be used to detect frauds of a type that has previously occurred.
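A very small version of the unsupervised approach is to model the baseline as the mean and standard deviation of past amounts and flag observations that lie too many standard deviations away. The z-score rule and the threshold of 3 are assumptions; the text does not prescribe a specific baseline model.

```python
from statistics import mean, stdev


def fit_baseline(amounts):
    """Summarise normal behavior as (mean, sample standard deviation)."""
    return mean(amounts), stdev(amounts)


def is_outlier(amount, baseline, threshold=3.0):
    """True if the amount deviates from the baseline by more than the threshold."""
    centre, spread = baseline
    if spread == 0:
        return amount != centre
    return abs(amount - centre) / spread > threshold


# Baseline built from unlabeled past transactions (made-up amounts).
baseline = fit_baseline([20, 25, 22, 30, 27, 24])
```

No fraud labels are needed to build the baseline, which is the key difference from the supervised methods described above.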
Application Description
In existing models, the bank verifies the credit card information, CVV number, date of expiry etc., but all this information is available on the card itself. Nowadays, banks also ask cardholders to register their credit card for an online secure password. In this newer model, after the card details are entered at the merchant site, the user is transferred to a secure gateway established on the bank’s own server. However, it does not verify whether the transaction is fraudulent or not. If hackers obtain the secure code of a credit card through phishing sites or any other source, it is very difficult to trace the fraudulent transaction. The proposed model based on HMM helps to check whether a transaction is fraudulent while the transaction is taking place. It includes two modules, as follows:
Online Shopping:
It comprises several steps: first, log in to a particular site to purchase goods or services; then choose an item; and next go to the payment page, where credit card information is required. After all this information is filled in, the page is directed to the proposed fraud detection system, which is installed at the bank’s server or the merchant site.
Fraud Detection System:
All the information about the credit card (such as credit card number, CVV number, expiry month and year, name on the card, etc.) is checked against the credit card database. If the data entered by the user is correct, the system asks for the Personal Identity Number (PIN). After the PIN is matched with the database and the account balance of the user’s credit card is found to be more than the purchase amount, the fraud checking module is activated. All data is verified before the first page of the credit card fraud detection system loads. If the user’s credit card has fewer than 10 transactions, the system directly asks for personal information to complete the transaction. Once a database of 10 transactions has been built, the fraud detection system starts to work. Using these observations, it determines the user’s spending profile. The purchase amount is checked against the spending profile of the user. Using the transition probability calculation based on HMM, the system concludes whether the transaction is genuine or fraudulent. If the transaction is judged not to be fraudulent, it is permitted to proceed. If the transaction is judged to be fraudulent, the security information form appears. It has a set of questions that the user has to answer correctly to complete the transaction; the answers relate to the credit card (such as account number and the security question and answer provided at the time of registration) and to personal, professional, address and date-of-birth details available in the database. If the information entered by the user matches the database, the transaction is completed securely. Otherwise, the transaction is terminated and the user is returned to the online shopping website.
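The decision flow above can be sketched with the standard HMM forward algorithm. The source does not state the exact decision rule, so this sketch assumes one common choice: compare the likelihood of the last 10 observed symbols with the likelihood of the sequence obtained by dropping the oldest symbol and appending the new one, and treat a large relative drop as suspicious. The two-state model parameters, the threshold and the function names are illustrative assumptions, not trained or published values.

```python
MIN_HISTORY = 10  # the system starts working after 10 transactions


def forward_probability(obs, start, trans, emit):
    """P(observation sequence | model), computed with the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


def check_transaction(history, new_symbol, model, threshold=0.5):
    """Decide what to do with a new transaction symbol (0=l, 1=m, 2=h)."""
    if len(history) < MIN_HISTORY:
        return "ask_personal_info"       # too little data for a profile
    old_seq = history[-MIN_HISTORY:]
    new_seq = old_seq[1:] + [new_symbol]
    p_old = forward_probability(old_seq, *model)
    p_new = forward_probability(new_seq, *model)
    drop = (p_old - p_new) / p_old       # assumed rule: relative likelihood drop
    return "ask_security_info" if drop > threshold else "approve"


# Hypothetical model of a mostly low-spending cardholder with 2 hidden states.
MODEL = (
    [0.8, 0.2],                              # initial state probabilities
    [[0.9, 0.1], [0.5, 0.5]],                # state transition probabilities
    [[0.85, 0.14, 0.01], [0.3, 0.5, 0.2]],   # emission probabilities for l, m, h
)

low_history = [0] * 10
```

With this model, another low-value purchase after ten low-value ones leaves the likelihood unchanged and is approved, while a high-value purchase lowers it sharply and triggers the security information form.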
System Architecture:
The implemented architecture consists of two subsystems: database interface and credit card fraud (CCF) detection engine. The database interface subsystem is the entry point through which the transactions are read into the system. It is the system’s interface with the banking software. Visual Basic.Net was used for the design of CCF detection, that is, as a front-end while Microsoft Access was used for the design of training and test database, as back-end. In the CCF detection subsystem, each transaction entering into the system was passed to the host server where the corresponding transaction profile is further checked using neural networks and transactions business rules.