
IntelliClean: A Knowledge-Based Intelligent Data Cleaner

  • Additional Information
    • Contributors:
      The Pennsylvania State University CiteSeerX Archives
    • Subject:
      2000
    • Collection:
      CiteSeerX
    • Abstract:
      Existing data cleaning methods work on the basis of computing the degree of similarity between nearby records in a sorted database. High recall is achieved by accepting records with low degrees of similarity as duplicates, at the cost of lower precision. High precision is achieved analogously at the cost of lower recall. This is the recall-precision dilemma. In this paper, we propose a generic knowledge-based framework for effective data cleaning that implements existing cleaning strategies and more. We develop a new method to compute transitive closure under uncertainty which handles the merging of groups of inexact duplicate records. Experimental results show that this framework can identify duplicates and anomalies with high recall and precision.
    • File Description:
      application/postscript
    • Relation:
      http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.7204; http://www.comp.nus.edu.sg/~leeml/papers/kdd00.ps
    • Online Access:
      http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.7204
      http://www.comp.nus.edu.sg/~leeml/papers/kdd00.ps
    • Rights:
      Metadata may be used without restrictions as long as the oai identifier remains attached to it.
    • Accession Number:
      edsbas.9B5CD11B
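
The abstract describes comparing nearby records in a sorted database and then merging groups of inexact duplicates. The following is a minimal sketch of that general idea, not the IntelliClean framework itself: the function names, the window size, the thresholds, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The merge step is an ordinary transitive closure (union-find), not the paper's method for transitive closure under uncertainty.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Degree of similarity in [0, 1] between two records (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(records, window, threshold):
    """Sort the records, then compare each one with its next window-1 neighbors."""
    order = sorted(range(len(records)), key=lambda i: records[i].lower())
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if similarity(records[i], records[j]) >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs


def merge_groups(n, pairs):
    """Plain transitive closure over matched pairs, using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical records, invented for this example.
    records = ["John Smith", "john smith", "Jon Smith", "Mary Jones", "Zed Quark"]
    for threshold in (0.95, 0.5):
        pairs = candidate_pairs(records, window=3, threshold=threshold)
        print(threshold, merge_groups(len(records), pairs))
```

Lowering the threshold accepts more pairs as duplicates, which tends to raise recall and lower precision; raising it does the reverse. This is the recall-precision dilemma the abstract refers to.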