k Nearest Neighbors classifier with sampling dependent decision rules

dc.contributor.advisor Kovács, György
dc.contributor.advisor Szeghalmy, Szilvia
dc.contributor.author László, Zsolt
dc.contributor.department DE--Informatikai Kar hu_HU
dc.date.accessioned 2017-05-02T07:23:45Z
dc.date.available 2017-05-02T07:23:45Z
dc.date.created 2017-04-20
dc.description.abstract The k Nearest Neighbors (kNN) method is a widely used technique for solving classification and regression problems in machine learning and data science. Compared to other methods such as Support Vector Machines or Neural Networks, kNN has very few parameters, which reduces the chance of overfitting when the number of training vectors is relatively small. Consequently, kNN has many practical applications, especially in fields where the available training data is limited or data acquisition is expensive. However, in many machine learning problems various circumstances can make the original kNN less accurate. Such circumstances may arise from unbalanced class sizes, from differing densities of training vectors, or from the noisy entries present in most databases. In this study, we introduce novel, local decision rules that also take possible sampling issues into consideration. The proposed model uses only the k nearest neighbors already chosen for classification and executes an algorithm with O(k^2) time complexity, which can be considered efficient as long as k is relatively low. The model was evaluated on widely used classification test databases, and the test results indicate that the proposed decision rule can increase classification accuracy in various problems. hu_HU
dc.description.course Programtervező informatikus hu_HU
dc.description.degree MSc/MA hu_HU
dc.format.extent 34 hu_HU
dc.identifier.uri http://hdl.handle.net/2437/238881
dc.language.iso en hu_HU
dc.subject kNN, k Nearest Neighbors, imbalanced sampling, classification hu_HU
dc.subject.dspace DEENK Témalista::Informatika hu_HU
dc.title k Nearest Neighbors classifier with sampling dependent decision rules hu_HU
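
For orientation, the baseline that the abstract refers to as the original kNN can be sketched as a plain majority vote among the k nearest training vectors. This is only an illustrative sketch of the standard rule, not the decision rule proposed in the thesis, which this record does not specify. The function name `knn_predict`, the Euclidean distance, and the toy data are assumptions made for the example.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    Euclidean distance is an assumption of this sketch.
    """
    # Sort the training points by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among those neighbors; the most frequent label wins.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data with unbalanced classes: four "a" points, two "b" points.
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.3, 0.3), "a"),
    ((1.0, 1.0), "b"), ((1.1, 0.9), "b"),
]

query = (1.0, 0.95)  # lies right next to the two "b" points
small_k = knn_predict(train, query, k=3)
large_k = knn_predict(train, query, k=5)
```

With k=3 the query is labeled "b", but with k=5 the three more distant "a" points outvote the two adjacent "b" points. This is the kind of class-imbalance effect the abstract gives as motivation for sampling dependent decision rules.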