Using KNN algorithm for classification of textual documents

Aiman Moldagulova, Rosnafisah Sulaiman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

The exponential growth in the generation of textual documents and the emerging need to structure them have increased attention to the automated classification of documents into predefined categories. There is a wide range of supervised learning algorithms for text classification. This paper presents an approach for building a machine learning system in R that uses the K-Nearest Neighbors (KNN) method to classify textual documents. The experimental part of the research was carried out on textual documents collected from two sources: http://egov.kz and http://www.government.kz. The experiment focused on a challenging aspect of the KNN algorithm: finding a proper value of k, the number of neighbors.
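The abstract does not give implementation details, and the system it describes was built in R. The following is only a rough Python sketch of the general idea, not the authors' method: documents become bag-of-words vectors, a query is labelled by majority vote among its k most similar training documents, and k is chosen by comparing leave-one-out accuracy over a few candidate values. The toy corpus, the labels, the use of cosine similarity on raw term counts, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT the egov.kz / government.kz data from the paper).
TOY_DOCS = [
    ("tax payment deadline for individual income tax", "finance"),
    ("budget tax revenue report", "finance"),
    ("state budget spending and tax policy", "finance"),
    ("hospital opens new clinic for patients", "health"),
    ("vaccination schedule for patients at the clinic", "health"),
    ("doctors treat patients in regional hospital", "health"),
]


def vectorize(text):
    """Turn a document into a sparse bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k most similar training docs."""
    q = vectorize(query)
    ranked = sorted(
        train,
        key=lambda item: cosine_similarity(q, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the label seen first (the nearer neighbor) wins.
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(docs, k):
    """Classify each document using all the others and return the hit rate."""
    hits = 0
    for i, (text, label) in enumerate(docs):
        rest = docs[:i] + docs[i + 1:]
        if knn_predict(rest, text, k) == label:
            hits += 1
    return hits / len(docs)


def choose_k(docs, candidates):
    """Return the candidate k with the best leave-one-out accuracy.

    Ties are broken in favor of the smaller k (an assumed convention).
    """
    scores = {k: leave_one_out_accuracy(docs, k) for k in candidates}
    best_k = max(scores, key=lambda k: (scores[k], -k))
    return best_k, scores


if __name__ == "__main__":
    best_k, scores = choose_k(TOY_DOCS, [1, 3, 5])
    print("accuracy per k:", scores)
    print("chosen k:", best_k)
    print(knn_predict(TOY_DOCS, "new tax on budget revenue", best_k))
```

On a real corpus one would typically add preprocessing (stop-word removal, stemming, TF-IDF weighting) and use a held-out test set or k-fold cross-validation rather than leave-one-out on six documents; the sketch only shows where the choice of k enters.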

Original language: English
Title of host publication: ICIT 2017 - 8th International Conference on Information Technology, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 665-671
Number of pages: 7
ISBN (Electronic): 9781509063321
DOI: https://doi.org/10.1109/ICITECH.2017.8079924
Publication status: Published - 20 Oct 2017
Event: 8th International Conference on Information Technology, ICIT 2017 - Amman, Jordan
Duration: 17 May 2017 to 18 May 2017

Other

Other: 8th International Conference on Information Technology, ICIT 2017
Country: Jordan
City: Amman
Period: 17/05/17 to 18/05/17

Fingerprint

Learning systems
Supervised learning
Learning algorithms
Learning
Growth
Research
K-nearest neighbor
Experiments
Learning algorithm
Machine learning
Government
Text classification
Experiment

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Health Informatics
  • Information Systems and Management
  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Moldagulova, A., & Sulaiman, R. (2017). Using KNN algorithm for classification of textual documents. In ICIT 2017 - 8th International Conference on Information Technology, Proceedings (pp. 665-671). [8079924] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICITECH.2017.8079924