Protein active site prediction for early drug discovery and designing

dc.contributor.author: Yousaf, Aqsa
dc.contributor.author: Shehzadi, Tahira
dc.contributor.author: Farooq, Aqeel
dc.contributor.author: Ilyas, Komal
dc.contributor.status: no [hu_HU]
dc.date.accessioned: 2021-12-20T13:57:34Z
dc.date.available: 2021-12-20T13:57:34Z
dc.date.issued: 2021-10-13
dc.description.abstract: Adenosine triphosphate (ATP) is an energy-carrying compound present in living organisms and is required by living cells for operations such as replication, molecule transport and chemical synthesis. ATP interacts with living cells through specialized sites called ATP-sites, which are present in various proteins of the cell. The life span of a cell can be controlled by controlling ATP compounds, and without the energy that ATP provides, cells cannot survive. Treatment of many diseases (such as cancer and diabetes) may become possible once protein active sites can be predicted. Considering the need for an algorithm that predicts ATP-sites with higher accuracy and effectiveness, this research work predicts protein ATP-sites in a novel way. Until now, the position-specific scoring matrix (PSSM), along with many physicochemical properties, has been used as features with deep neural networks to build models that predict ATP-sites, which requires complex computation. To overcome this problem, this work proposes k-mer feature vectors with simple machine learning (ML) models to attain the same or even better performance with less computation. Using 2-mers as feature vectors, this research work trained and tested five different models: KNN, Conv1D, XGBoost, SVM and Random Forest. SVM gave the best performance on k-mer features. The created model achieves an accuracy of 96%, an MCC of 90% and a ROC-AUC of 99%, which are the same as or, in some aspects, better than the state-of-the-art results of 97% accuracy, 78% MCC and 92% ROC-AUC. One of the benefits of the created model is that it is much simpler while scoring higher on MCC and ROC-AUC. [hu_HU]
dc.identifier.doi: 10.1556/1848.2021.00315 [hu_HU]
dc.identifier.issn: 2062-0810
dc.identifier.issue: 1 [hu_HU]
dc.identifier.jtitle: International Review of Applied Sciences and Engineering
dc.identifier.uri: http://hdl.handle.net/2437/326519
dc.identifier.url: https://akjournals.com/view/journals/1848/13/1/article-p98.xml [hu_HU]
dc.identifier.volume: 13 [hu_HU]
dc.language.iso: en [hu_HU]
dc.publisher: Akadémiai Kiadó [hu_HU]
dc.subject: sequence-based features [hu_HU]
dc.subject: ATP-sites [hu_HU]
dc.subject: XGBoost [hu_HU]
dc.subject: Conv1D [hu_HU]
dc.subject: MCC [hu_HU]
dc.subject: AUC [hu_HU]
dc.subject: ROC [hu_HU]
dc.subject: ATP [hu_HU]
dc.subject: PSSM [hu_HU]
dc.title: Protein active site prediction for early drug discovery and designing [hu_HU]
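Illustrative note on the 2-mer features mentioned in the abstract: the record does not describe how the authors built their k-mer vectors, so the following is only a minimal sketch of one common approach, assuming the 20-letter standard amino acid alphabet, overlapping k-mers, and frequency normalisation. The function names and the example sequence are hypothetical and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes). This alphabet is an
# assumption; the abstract does not state which one the authors used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_vocabulary(k=2, alphabet=AMINO_ACIDS):
    """Return every possible k-mer over the alphabet in a fixed order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def kmer_features(sequence, k=2, alphabet=AMINO_ACIDS):
    """Count overlapping k-mers and normalise the counts to frequencies."""
    vocab = kmer_vocabulary(k, alphabet)
    index = {kmer: i for i, kmer in enumerate(vocab)}
    counts = [0] * len(vocab)
    total = 0
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k].upper()
        if kmer in index:  # skip k-mers containing non-standard residues
            counts[index[kmer]] += 1
            total += 1
    if total == 0:
        # Sequence too short or made only of unknown residues.
        return [0.0] * len(vocab)
    return [c / total for c in counts]


# Hypothetical sequence fragment, for illustration only.
features = kmer_features("GAGKSTLL")
print(len(features))  # 20 * 20 = 400 possible 2-mers
```

With k = 2 this gives a fixed-length vector of 400 values per sequence, which could then be passed to a conventional classifier such as an SVM. How the paper windows sequences around candidate residues and labels them is not stated in this record.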
Files
Original bundle
Name: _20634269___International_Review_of_Applied_Sciences_and_Engineering__Protein_active_site_prediction_for_early_drug_discovery_and_designing.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.57 KB
Format: Item-specific license agreed upon to submission
Description: