The Hidden Truth
Data analysis can be a strategic weapon in your company's management and control of fraud
By Girish Keshav Palshikar
Continued from Page 1
Medical Fraud
I'll illustrate some of these techniques by applying them to fraud detection in a hypothetical and highly simplified medical insurance claims database. This database (maintained by the insurance company and populated from the claim documents submitted by patients) consists of a single table and has the following format:
Let's evaluate whether a specific new claim is "suspicious" in some way. If it is, the claim can be processed differently: cancel the claim payment, proceed with the claim payment, recall the claim, reduce the payment amount, or seek clarification from the hospital or patient. To evaluate a new claim, you can often define various criteria or indices of suspiciousness. For each criterion or index, the claim gets a score; typically, a high score on a specific index indicates greater suspiciousness. Thus, a claim that scores high on many criteria is more suspicious. Examples of such criteria include:
You can define many more such indices. All such indices have to be defined rigorously; the previous descriptions are merely indicative. Ideally, the fraud control system provides a facility for defining such indices dynamically, outside the system, so that enhancements are easy. Because the indices represent knowledge about fraud detection in claims data, a rule language can capture them in a knowledge base.
The system can also provide a facility that lists claims similar to a given claim (based on k-nearest-neighbor algorithms, for example), along with a similarity matching score. This facility enables the end user to evaluate the given claim with respect to similar claims. From a pool of already known fraudulent claims, machine-learning algorithms can construct a classifier (such as a decision tree) that helps evaluate a new claim.
As a simple example, you can check the disease (illness) ID against the duration and costs. Using the historical claims database, you can easily build a histogram of hospital-stay duration bins (0 to 2 days, 3 to 5 days, and so on) against the number of claims; the histogram is built for a specific illness ID, sex, and age group. You can then compare the claim's duration against this histogram. If it falls in a sparsely populated bin, the claim is at least a bit suspicious. Clustering of historical data can be used to detect such outliers automatically.
Several types of calculations can be performed for fraud detection, such as regression analysis and time-series analysis. In time-series analysis, time-stamped data is analyzed for trends, seasonal patterns, and outliers. The series is first transformed, if necessary, so that the variance is constant. Additional assumptions may be needed because the observations in claims data aren't necessarily at regular time intervals. Several time series in the claims data can be analyzed using time-series analysis techniques.
The following are some variables that are important for fraud detection in the claims data. Multiple regression analysis can be performed on chosen subsets of these variables:
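As a minimal sketch of how such a regression might flag a claim, assume three hypothetical variables (`stay_days`, `age`, and `claim_amount`, none of which are taken from the original table) and a small invented history. The code fits an ordinary least-squares model of claim amount on the other two variables and treats a new claim as suspicious when its residual exceeds three residual standard deviations. The threshold, the variable names, and the data are all assumptions made for illustration, not the author's method.

```python
import math

def fit_ols(X, y):
    """Return [intercept, b1, b2, ...] by solving the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]

def predict(coef, x):
    """Predicted claim amount for one (stay_days, age) pair."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))

def residual_std(coef, X, y):
    """Residual standard deviation with n - k degrees of freedom."""
    ss = sum((t - predict(coef, x)) ** 2 for x, t in zip(X, y))
    return math.sqrt(ss / (len(y) - len(coef)))

def is_suspicious(coef, sigma, x, amount, z=3.0):
    """Flag a claim whose amount is more than z sigmas from the prediction."""
    return abs(amount - predict(coef, x)) > z * sigma

# Hypothetical history: (stay_days, age) and the claimed amount.
HISTORY_X = [(2, 30), (3, 45), (5, 50), (1, 25), (4, 60), (6, 35), (2, 55), (7, 40)]
HISTORY_Y = [660, 915, 1360, 420, 1195, 1480, 770, 1705]

coef = fit_ols(HISTORY_X, HISTORY_Y)
sigma = residual_std(coef, HISTORY_X, HISTORY_Y)

print(is_suspicious(coef, sigma, (3, 40), 2500))  # far above the fitted amount
print(is_suspicious(coef, sigma, (3, 40), 905))   # close to the fitted amount
```

In practice the regression would be fitted on a much larger history, and the choice of variables and threshold would need the rigorous definition the article calls for.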
Statistical analysis can also be performed for identifying outliers:
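One outlier check described earlier, comparing a claim's hospital duration against a histogram of historical durations, can be sketched as follows. The bin width of three days matches the "0 to 2 days, 3 to 5 days" bins in the text; the sample durations, the function names, and the 10 percent sparseness threshold are assumptions made for illustration. The history is assumed to be already filtered to one illness ID, sex, and age group.

```python
from collections import Counter

BIN_WIDTH = 3  # days per bin: 0-2, 3-5, 6-8, ...

def duration_bin(days, width=BIN_WIDTH):
    """Map a stay duration in days to its bin index."""
    return days // width

def build_histogram(durations, width=BIN_WIDTH):
    """Count historical claims per duration bin."""
    return Counter(duration_bin(d, width) for d in durations)

def bin_share(hist, days, width=BIN_WIDTH):
    """Fraction of historical claims that fall in the same bin as `days`."""
    total = sum(hist.values())
    return hist[duration_bin(days, width)] / total if total else 0.0

def is_sparse(hist, days, min_share=0.10, width=BIN_WIDTH):
    """True when the claim's duration falls in a sparsely populated bin."""
    return bin_share(hist, days, width) < min_share

# Hypothetical historical durations for one illness ID, sex, and age group.
HISTORY = [1, 2, 2, 1,
           3, 4, 4, 5, 3, 4, 5, 4, 3, 4, 4, 5, 3, 4, 4,
           13]

hist = build_histogram(HISTORY)
print(is_sparse(hist, 4))   # typical stay, well-populated bin
print(is_sparse(hist, 13))  # only one historical claim in this bin
print(is_sparse(hist, 20))  # no historical claims in this bin
```

A sparse bin is only a hint of suspiciousness, so a result like this would feed into the claim's overall score rather than decide the outcome on its own.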
Some important temporal parameters for a claim include:
All such statistical analyses need to be studied in depth and defined for the specific tasks of fraud detection and control in the medical claims domain. A large number of predefined statistical calculations oriented toward detecting suspicious data can be provided.
Fraud is an important phenomenon in today's wired commercial world. Fraud causes huge losses and damages an organization's reputation and goodwill. Fraud management is a complex and knowledge-intensive process involving deployment and effective use of tools based on a plethora of statistical and AI techniques.
The author wishes to thank Prof. Mathai Joseph for his support. Thanks to Dr. Manasee Palshikar for her patience, hopes, and confidence.
Girish Keshav Palshikar [girishp@pune.tcs.co.in] is a scientist at Tata Research Development and Design Centre (TRDDC) in Pune, India. TRDDC is the R&D Division of Tata Consultancy Services, India's largest software company. His areas of work include theory and applications of artificial intelligence.
RESOURCES
AI and Fraud Detection/Fraud Management: www.dinkla.net/fraud
AI techniques in fraud management: www.aaai.org/AITopics/html/fraud.html
Association of Certified Fraud Examiners: www.cfenet.com
Communications Fraud Control Association: www.cfca.org
Computer Fraud and Security Journal: www.elsevier.nl
Medicare help line for fraud: www.medicare.gov/fraudabuse/overview.asp
NASD Regulations: www.nasdr.com
National Check Fraud Centre: www.ckfraud.org
National Fraud Information Centre: www.fraud.org
National Healthcare Anti-Fraud Association: www.nhcaa.org
Online magazine for insurance fraud: www.fraudreport.com
U.S. Securities and Exchange Commission: www.sec.gov