
    Mathematical Programming Approaches to Machine Learning and Data Mining

    File(s)
    Mathematical Programming Approaches to Machine Learning and Data Mining (894.0Kb)
    Date
    1998
    Author
    Bradley, Paul S.
    Metadata
    Show full item record
    Abstract
    Machine learning problems of supervised classification, unsupervised clustering and parsimonious approximation are formulated as mathematical programs. The feature selection problem arising in the supervised classification task is effectively addressed by calculating a separating plane that minimizes both the separation error and the number of problem features utilized. The support vector machine approach is formulated using various norms to measure the margin of separation. The clustering problem of assigning m points in n-dimensional real space to k clusters is formulated as minimizing a piecewise-linear concave function over a polyhedral set. This problem is also formulated in a novel fashion by minimizing the sum of squared distances of data points to the nearest of the cluster planes characterizing the k clusters. The problem of obtaining a parsimonious solution to a linear system whose right-hand side vector may be corrupted by noise is formulated as minimizing the system residual plus either the number of nonzero elements in the solution vector or the norm of the solution vector. The feature selection problem, the clustering problem and the parsimonious approximation problem can all be stated as the minimization of a concave function over a polyhedral region, and all are solved by a theoretically justifiable, fast and finite successive linearization algorithm. Numerical tests indicate the utility and efficiency of these formulations on real-world databases. In particular, the feature selection approach via concave minimization computes a separating-plane-based classifier that improves upon the generalization ability of a separating plane computed without feature suppression. This approach produces classifiers that use fewer of the original problem features than the support vector machine approaches, with comparable generalization ability. The clustering techniques are shown to be effective and efficient data mining tools in medical survival analysis applications. The parsimonious approximation methods yield improved results over least squares and a lengthy combinatorial search in a signal processing application with a high signal-to-noise ratio. These results support the claim that mathematical programming is effective as the basis of data mining tools that extract patterns containing "knowledge" from a database and thus achieve "knowledge discovery in databases".
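    The novel clustering formulation above, minimizing the sum of squared distances of points to their nearest cluster plane, suggests an alternating scheme: assign each point to its closest plane, then refit each plane, where the best-fit normal for a cluster is an eigenvector of the cluster's scatter matrix corresponding to its smallest eigenvalue. The following is only a rough sketch of that alternation under these assumptions; the function name and the random initialization are illustrative, not taken from the report:

    ```python
    import numpy as np

    def kplane_cluster(X, k, iters=50, seed=0):
        """Alternate between (a) assigning each point to its nearest plane and
        (b) refitting each cluster plane (w, gamma) with ||w|| = 1 so that the
        sum of squared point-to-plane distances |x.w - gamma|^2 is minimized."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        labels = rng.integers(k, size=m)          # illustrative random start
        W = np.zeros((k, n))
        g = np.zeros(k)
        for _ in range(iters):
            for j in range(k):
                pts = X[labels == j]
                if len(pts) == 0:                 # reseed an empty cluster
                    pts = X[rng.integers(m, size=1)]
                centered = pts - pts.mean(axis=0)
                # eigenvector for the smallest eigenvalue of the scatter
                # matrix is the least-squares plane normal (eigh sorts
                # eigenvalues in ascending order)
                _, vecs = np.linalg.eigh(centered.T @ centered)
                W[j] = vecs[:, 0]
                g[j] = pts.mean(axis=0) @ W[j]
            new = np.argmin(np.abs(X @ W.T - g), axis=1)
            if np.array_equal(new, labels):       # assignments stable: done
                break
            labels = new
        return labels, W, g
    ```

    Each half-step does not increase the objective, so the alternation terminates at a locally optimal plane assignment.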
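    The successive linearization idea central to the abstract can be illustrated concretely: a concave function lies below its tangent plane, so repeatedly minimizing the linearization of the objective at the current iterate over the polyhedral set never increases the objective, and the iterates settle on a vertex in finitely many steps. A minimal sketch under stated assumptions (SciPy's `linprog` as the LP solver, a box as the polyhedral set, and an illustrative concave objective f(x) = -x1² - x2²; the function name is my own):

    ```python
    import numpy as np
    from scipy.optimize import linprog

    def successive_linearization(grad_f, bounds, x0, max_iter=100, tol=1e-8):
        """Minimize a concave function over a box (a simple polyhedral set)
        by repeatedly solving the LP  min  grad_f(x_i)^T x  over the set."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            res = linprog(c=grad_f(x), bounds=bounds)  # linearized subproblem
            if np.linalg.norm(res.x - x) < tol:        # fixed point: a vertex
                return res.x
            x = res.x
        return x

    # Illustrative concave objective f(x) = -x1^2 - x2^2 over the unit box;
    # its minimum over [0, 1]^2 is attained at the vertex (1, 1).
    grad = lambda x: -2.0 * x
    x_star = successive_linearization(grad, bounds=[(0, 1), (0, 1)],
                                      x0=[0.5, 0.5])
    ```

    Starting from the box center, the first LP already lands on the optimal vertex and the second confirms it as a fixed point, illustrating the finite termination the abstract claims.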
    Permanent Link
    http://digital.library.wisc.edu/1793/66268
    Type
    Technical Report
    Citation
    98-11
    Part of
    • Math Prog Technical Reports
