Pfliegler, Valter Péter
Németh, Bálint
Allouh, Fares Raja Farah
2024-12-17
2024-12-17
2024-11-14
https://hdl.handle.net/2437/383183
This in silico experiment used sequencing data for chromosome 1 from many S. cerevisiae strains. The yeast compendium created by the Department of Molecular Biotechnology and Microbiology provided data for these strains and their separate sequencing runs (replicates). Variants from same-strain replicates were called and combined into VCF files, which were then filtered in two ways: by hard filtering and by a convolutional neural network. The study describes a pipeline that uses the Genome Analysis Toolkit (GATK) for variant calling and filtration. Finally, the efficacy of the two methods is compared, and their strengths and weaknesses are highlighted.
39
en
Bioinformatics
Saccharomyces cerevisiae
Next-generation sequencing
GATK
Applying Machine Learning for VCF file filtering
Biology::Biotechnology
Accessible pursuant to the December 2022 amendment to the Higher Education Act.