Text classification method review

dc.contributor.author: Mahinovs, Aigars
dc.contributor.author: Tiwari, Ashutosh
dc.contributor.author: Roy, Rajkumar
dc.contributor.author: Baxter, David
dc.date.accessioned: 2011-10-11T07:11:08Z
dc.date.available: 2011-10-11T07:11:08Z
dc.date.issued: 2007-04-01T00:00:00Z
dc.description.abstract: With the explosion of information fuelled by the growth of the World Wide Web, it is no longer feasible for a human observer to understand all the incoming data or even to classify it into categories. With this growth of information and the simultaneous growth of available computing power, automatic classification of data, particularly textual data, is gaining increasing importance. This paper provides a review of the generic text classification process, the phases of that process and the methods used at each phase. Examples from web page classification and spam classification are provided throughout the text. The principles of operation of four main text classification engines are described: Naïve Bayesian, k Nearest Neighbours, Support Vector Machines and Perceptron Neural Networks. The paper looks at the state of the art in all these phases, noting the methods and algorithms used, the different ways in which researchers are trying to reduce the computational complexity and improve the precision of the text classification process, and how text classification is used in practice. The paper is written to avoid extensive use of mathematical formulae so that it is better suited to readers with little or no background in theoretical mathematics. [en_UK]
dc.identifier.isbn: 978-1-86194-128-2
dc.identifier.uri: http://dspace.lib.cranfield.ac.uk/handle/1826/1860
dc.subject: Text classification [en_UK]
dc.subject: Bayes [en_UK]
dc.subject: kNN [en_UK]
dc.subject: SVM [en_UK]
dc.subject: Neural network [en_UK]
dc.subject: Feature extraction [en_UK]
dc.subject: Feature reduction [en_UK]
dc.subject: Web page classification [en_UK]
dc.title: Text classification method review [en_UK]
dc.type: Report

Files

Original bundle
Name: mahinovs.pdf
Size: 326.27 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 18 B
Format: Plain Text
Description: