A New Descriptor Selection Scheme for SVM in Unbalanced Class Problem: A Case Study Using Skin Sensitisation Dataset

Public Domain


Details

  • Description:
    A novel descriptor selection scheme for the Support Vector Machine (SVM) classification method is proposed, and its utility is demonstrated using a skin sensitisation dataset as an example. A backward elimination procedure was devised for the SVM, guided by the mean accuracy (the average of specificity and sensitivity) of a leave-one-out cross-validation. Subsets of descriptors were first selected using a sequential t-test filter or a Random Forest filter, and backward elimination was then applied. Different SVM kernels were compared using this descriptor selection scheme. The Radial Basis Function (RBF) kernel worked best when a sequential t-test filter was adopted. The highest mean accuracy, 84.9%, was obtained using SVM with 23 descriptors; the corresponding sensitivity and specificity were 93.1% and 76.6%, respectively. A linear kernel was found to be optimal when a Random Forest filter was used, and its performance with 24 descriptors was comparable to that of the RBF kernel with a sequential t-test filter. For comparison, Fisher's linear discriminant analysis (LDA) was carried out under the same descriptor selection scheme. SVM was shown to outperform LDA. [Description provided by NIOSH]
  • ISSN:
    1062-936X
  • Pages in Document:
    423-441
  • Volume:
    18
  • Issue:
    5
  • NIOSHTIC Number:
    nn:20032635
  • Citation:
    SAR QSAR Environ Res 2007 Jul; 18(5-6):423-441
  • Contact Point Address:
    S. Li, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, WV 26505
  • Email:
    swl4@cdc.gov
  • Federal Fiscal Year:
    2007
  • Peer Reviewed:
    True
  • Source Full Name:
    SAR and QSAR in Environmental Research
  • Main Document Checksum:
    urn:sha-512:92e8372aec277adbf965d0f7237dcd5907ca025b23a76ade165fac2d03857ae36946faf8dcca8e2ffc12178a6fd76c3a98a72ed12f41c4cda8465d4534d122f7
  • File Type:
    PDF - 1.64 MB
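
The selection loop summarised in the Description field (backward elimination scored by the leave-one-out mean accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it swaps in a nearest-centroid classifier where the paper uses an SVM, and it omits the t-test and Random Forest pre-filters. All function names, the tie-breaking rule, and the choice to eliminate down to a single descriptor are assumptions made for illustration. Only the scoring rule comes from the record: the mean accuracy is the average of sensitivity and specificity, so the reported 93.1% and 76.6% average to 84.85%, consistent with the reported 84.9%.

```python
from typing import List, Optional, Tuple


def mean_accuracy(y_true: List[int], y_pred: List[Optional[int]]) -> float:
    """Average of sensitivity and specificity for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    sensitivity = tp / pos if pos else 0.0
    specificity = tn / neg if neg else 0.0
    return (sensitivity + specificity) / 2.0


def nearest_centroid_predict(train_X, train_y, x, cols) -> Optional[int]:
    """Stand-in classifier (NOT an SVM): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [r for r, t in zip(train_X, train_y) if t == label]
        if not members:
            continue
        centroid = [sum(r[c] for r in members) / len(members) for c in cols]
        dist = sum((x[c] - m) ** 2 for c, m in zip(cols, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_mean_accuracy(X: List[List[float]], y: List[int], cols: List[int]) -> float:
    """Leave-one-out cross-validation: hold out each sample once, then score."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(nearest_centroid_predict(train_X, train_y, X[i], cols))
    return mean_accuracy(y, preds)


def backward_eliminate(X, y, cols: List[int]) -> Tuple[List[int], float]:
    """Drop one descriptor per round, keeping the best-scoring subset seen."""
    current = list(cols)
    best_cols, best_score = list(current), loo_mean_accuracy(X, y, current)
    while len(current) > 1:
        # Score every subset obtained by removing a single descriptor.
        candidates = [
            (loo_mean_accuracy(X, y, [c for c in current if c != d]), d)
            for d in current
        ]
        score, drop = max(candidates, key=lambda sd: sd[0])
        current.remove(drop)
        # Assumption: ties favour the smaller descriptor set.
        if score >= best_score:
            best_cols, best_score = list(current), score
    return best_cols, best_score
```

Calling `backward_eliminate(X, y, list(range(n_descriptors)))` on a list-of-lists descriptor matrix `X` with 0/1 labels `y` returns the retained column indices and their leave-one-out mean accuracy. Scoring by mean accuracy rather than raw accuracy keeps the majority class from dominating the selection, which is the point of the scheme for unbalanced data.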