Citation:
Hongmei He, Tim Watson, Carsten Maple, Jörn Mehnen, Ashutosh Tiwari. A new semantic attribute deep learning with a linguistic attribute hierarchy for spam detection. International Joint Conference on Neural Networks, 14-19 May 2017, Anchorage, Alaska, USA.
Abstract:
The massive increase in spam poses a
serious threat to email and SMS, which have become important
means of communication. Not only does spam annoy users, but
it also poses a security threat. Machine learning techniques
have been widely used for spam detection. In this paper, we
propose another form of deep learning, a linguistic attribute
hierarchy, embedded with linguistic decision trees, for spam
detection, and examine the effect of semantic attributes, as
represented by the linguistic attribute hierarchy, on spam detection.
A case study on the SMS message database from the UCI Machine
Learning Repository has shown that a linguistic attribute hierarchy
embedded with linguistic decision trees provides a transparent
approach to in-depth analysis of the impact of attributes on spam
detection. This approach can not only efficiently tackle the ‘curse
of dimensionality’ in spam detection with a large number of attributes,
but also improve the performance of spam detection when the
semantic attributes are organised into a proper hierarchy.