By analyzing the dbSNPv138 and ESP6500 databases, the authors identified the 100 most frequently mutated genes and then compared them with several other gene sets.

The paper is: FLAGS, frequently mutated genes in public exomes, https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-014-0064-y

We used publicly available exome cohorts, together with the dbSNP database, to derive a list of genes (n = 100) that most frequently exhibit rare (<1%) non-synonymous/splice-site variants in general populations. We termed these genes FLAGS for FrequentLy mutAted GeneS and analyzed their properties.

| Name of datasets | Size | Description |
| --- | --- | --- |
| FLAGS | 100 | The top 100 of FrequentLy mutAted GeneS with rare (<1% allelic frequency) functional variants from dbSNPv138 and ESP6500 |
| OMIM | 3099 | The list of protein-coding genes associated with human diseases from Online Mendelian Inheritance in Man [8] |
| HGMD | 2691 | The list of protein-coding genes with damaging mutations (<1% allelic frequency) from Human Gene Mutation Database [28]. |
| WES | 300 | Downloaded from Boycott et al. (2013) [7] - a list of novel genes implicated in human disorders based on whole exome sequencing studies, or novel/known pathogenic mutations discovered by whole-exome sequencing. |
| Background | 18580 | The entire set of human protein-coding genes that have complete start and end translation annotations with a specified dN/dS ratio |

Some properties of FLAGS:

- Variants detected in FLAGS tend to be predicted as less deleterious
- FLAGS tend to be reported in PubMed and associated with disease phenotypes
- FLAGS have significantly longer coding lengths, higher average dN/dS ratios, and more paralogs than genes from OMIM and HGMD.
- FLAGS recently implicated in rare-Mendelian disorders.
- FLAGS are less likely to be disease-associated

The authors also used Tagxedo (http://www.tagxedo.com/) to draw a word cloud of these 100 genes.
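The comparison between FLAGS and the other gene lists comes down to set overlap on gene symbols. Below is a minimal sketch of that idea, not the authors' actual code. The function name `overlap_summary` is hypothetical, and the "disease" list in the example is a made-up toy set rather than real OMIM or HGMD content.

```python
def overlap_summary(flags, other):
    """Return the genes shared by two lists and the fraction of `flags` they cover."""
    flags, other = set(flags), set(other)
    shared = flags & other
    # Guard against an empty FLAGS list to avoid division by zero.
    fraction = len(shared) / len(flags) if flags else 0.0
    return sorted(shared), fraction


# Toy example: four FLAGS symbols against a hypothetical disease-gene list.
flags_subset = ["TTN", "MUC16", "OBSCN", "BRCA2"]
toy_disease_genes = ["TTN", "BRCA2", "CFTR"]  # illustrative only, not real OMIM data

shared, fraction = overlap_summary(flags_subset, toy_disease_genes)
print(shared, fraction)
```

With the real lists, the same function could be called once per dataset (OMIM, HGMD, WES) to see how much of FLAGS each one contains.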

In the list below, the first column is the NCBI Entrez ID of the gene, the second column is the gene symbol, and the third column is the full gene name.

| Entrez ID | Symbol | Full name |
| --- | --- | --- |
| 154664 | ABCA13 | ATP-binding cassette, sub-family A (ABC1), member 13 |
| 24 | ABCA4 | ATP-binding cassette, sub-family A (ABC1), member 4 |
| 10347 | ABCA7 | ATP-binding cassette, sub-family A (ABC1), member 7 |
| 79026 | AHNAK | AHNAK nucleoprotein |
| 113146 | AHNAK2 | AHNAK nucleoprotein 2 |
| 11214 | AKAP13 | A kinase (PRKA) anchor protein 13 |
| 7840 | ALMS1 | Alstrom syndrome protein 1 |
| 288 | ANK3 | ankyrin 3, node of Ranvier (ankyrin G) |
| 338 | APOB | apolipoprotein B |
| 259266 | ASPM | abnormal spindle microtubule assembly |
| 675 | BRCA2 | breast cancer 2, early onset |
| 8927 | BSN | bassoon presynaptic cytomatrix protein |
| 64072 | CDH23 | cadherin-related 23 |
| 9620 | CELSR1 | cadherin, EGF LAG seven-pass G-type receptor 1 |
| 202333 | CMYA5 | cardiomyopathy associated 5 |
| 1293 | COL6A3 | collagen, type VI, alpha 3 |
| 1294 | COL7A1 | collagen, type VII, alpha 1 |
| 64478 | CSMD1 | CUB and Sushi multiple domains 1 |
| 8029 | CUBN | cubilin (intrinsic factor-cobalamin receptor) |
| 25981 | DNAH1 | dynein, axonemal, heavy chain 1 |
| 196385 | DNAH10 | dynein, axonemal, heavy chain 10 |
| 8701 | DNAH11 | dynein, axonemal, heavy chain 11 |
| 8632 | DNAH17 | dynein, axonemal, heavy chain 17 |
| 146754 | DNAH2 | dynein, axonemal, heavy chain 2 |
| 55567 | DNAH3 | dynein, axonemal, heavy chain 3 |
| 1767 | DNAH5 | dynein, axonemal, heavy chain 5 |
| 56171 | DNAH7 | dynein, axonemal, heavy chain 7 |
| 1769 | DNAH8 | dynein, axonemal, heavy chain 8 |
| 1770 | DNAH9 | dynein, axonemal, heavy chain 9 |
| 667 | DST | dystonin |
| 79659 | DYNC2H1 | dynein, cytoplasmic 2, heavy chain 1 |
| 83481 | EPPK1 | epiplakin 1 |
| 2195 | FAT1 | FAT atypical cadherin 1 |
| 2196 | FAT2 | FAT atypical cadherin 2 |
| 120114 | FAT3 | FAT atypical cadherin 3 |
| 79633 | FAT4 | FAT atypical cadherin 4 |
| 84467 | FBN3 | fibrillin 3 |
| 8857 | FCGBP | Fc fragment of IgG binding protein |
| 2312 | FLG | filaggrin |
| 80144 | FRAS1 | Fraser extracellular matrix complex subunit 1 |
| 341640 | FREM2 | FRAS1 related extracellular matrix protein 2 |
| NA | GPR98 | |
| 85441 | HELZ2 | helicase with zinc finger 2, transcriptional coactivator |
| 8924 | HERC2 | HECT and RLD domain containing E3 ubiquitin protein ligase 2 |
| 83872 | HMCN1 | hemicentin 1 |
| 388697 | HRNR | hornerin |
| 3339 | HSPG2 | heparan sulfate proteoglycan 2 |
| 58508 | KMT2C | lysine (K)-specific methyltransferase 2C |
| 8085 | KMT2D | lysine (K)-specific methyltransferase 2D |
| 284217 | LAMA1 | laminin, alpha 1 |
| 3908 | LAMA2 | laminin, alpha 2 |
| 3909 | LAMA3 | laminin, alpha 3 |
| 3911 | LAMA5 | laminin, alpha 5 |
| 4035 | LRP1 | low density lipoprotein receptor-related protein 1 |
| 53353 | LRP1B | low density lipoprotein receptor-related protein 1B |
| 4036 | LRP2 | low density lipoprotein receptor-related protein 2 |
| 23499 | MACF1 | microtubule-actin crosslinking factor 1 |
| 23195 | MDN1 | midasin AAA ATPase 1 |
| 4288 | MKI67 | marker of proliferation Ki-67 |
| 94025 | MUC16 | mucin 16, cell surface associated |
| 140453 | MUC17 | mucin 17, cell surface associated |
| 4583 | MUC2 | mucin 2, oligomeric mucus/gel-forming |
| 727897 | MUC5B | mucin 5B, oligomeric mucus/gel-forming |
| 51168 | MYO15A | myosin XVA |
| 9172 | MYOM2 | myomesin 2 |
| 4703 | NEB | nebulin |
| 84033 | OBSCN | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF |
| 27445 | PCLO | piccolo presynaptic cytomatrix protein |
| 5116 | PCNT | pericentrin |
| 9659 | PDE4DIP | phosphodiesterase 4D interacting protein |
| 5310 | PKD1 | polycystic kidney disease 1 (autosomal dominant) |
| 168507 | PKD1L1 | polycystic kidney disease 1 like 1 |
| 5314 | PKHD1 | polycystic kidney and hepatic disease 1 (autosomal recessive) |
| 93035 | PKHD1L1 | polycystic kidney and hepatic disease 1 (autosomal recessive)-like 1 |
| 5339 | PLEC | plectin |
| 57674 | RNF213 | ring finger protein 213 |
| 94137 | RP1L1 | retinitis pigmentosa 1-like 1 |
| 6261 | RYR1 | ryanodine receptor 1 (skeletal) |
| 6263 | RYR3 | ryanodine receptor 3 |
| 26278 | SACS | sacsin molecular chaperone |
| 51332 | SPTBN5 | spectrin, beta, non-erythrocytic 5 |
| 23524 | SRRM2 | serine/arginine repetitive matrix 2 |
| 23166 | STAB1 | stabilin 1 |
| 55576 | STAB2 | stabilin 2 |
| 23345 | SYNE1 | spectrin repeat containing, nuclear envelope 1 |
| 23224 | SYNE2 | spectrin repeat containing, nuclear envelope 2 |
| 10579 | TACC2 | transforming, acidic coiled-coil containing protein 2 |
| 7011 | TEP1 | telomerase-associated protein 1 |
| 7038 | TG | thyroglobulin |
| 7273 | TTN | titin |
| 23352 | UBR4 | ubiquitin protein ligase E3 component n-recognin 4 |
| 7399 | USH2A | Usher syndrome 2A (autosomal recessive, mild) |
| 7402 | UTRN | utrophin |
| 157680 | VPS13B | vacuolar protein sorting 13 homolog B (yeast) |
| 54832 | VPS13C | vacuolar protein sorting 13 homolog C (S. cerevisiae) |
| 55187 | VPS13D | vacuolar protein sorting 13 homolog D (S. cerevisiae) |
| 7450 | VWF | von Willebrand factor |
| 129446 | XIRP2 | xin actin binding repeat containing 2 |
| 7455 | ZAN | zonadhesin (gene/pseudogene) |
| 463 | ZFHX3 | zinc finger homeobox 3 |

When the authors published this paper, the databases were still dbSNPv138 and ESP6500. By now they have probably been updated to dbSNPv155 and gnomAD. Could the same pipeline be rerun on the newer data?
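If one wanted to rerun the ranking on a newer resource, the core step would be counting rare (<1%) non-synonymous/splice-site variants per gene and keeping the top 100. The following is a rough sketch under stated assumptions, not the paper's pipeline. It assumes the variants have already been annotated and reduced to `(gene, allele_frequency, consequence)` tuples. The consequence labels, the `FUNCTIONAL` set, and the function name `top_mutated_genes` are all hypothetical, and real annotation tools use their own vocabularies.

```python
from collections import Counter

RARE_AF = 0.01  # "rare" threshold from the paper: allele frequency below 1%
# Assumed consequence labels; a real annotator would use different terms.
FUNCTIONAL = {"missense", "nonsense", "splice_site"}


def top_mutated_genes(variants, n=100):
    """Rank genes by their number of rare functional variants.

    `variants` is an iterable of (gene, allele_frequency, consequence) tuples.
    Returns a list of (gene, count) pairs, most frequent first.
    """
    counts = Counter(
        gene
        for gene, af, consequence in variants
        if af < RARE_AF and consequence in FUNCTIONAL
    )
    return counts.most_common(n)


# Toy input, invented for illustration only.
toy_variants = [
    ("TTN", 0.001, "missense"),
    ("TTN", 0.004, "splice_site"),
    ("TTN", 0.200, "missense"),      # too common, skipped
    ("MUC16", 0.002, "missense"),
    ("MUC16", 0.003, "synonymous"),  # not functional, skipped
    ("BRCA2", 0.050, "nonsense"),    # too common, skipped
]
print(top_mutated_genes(toy_variants, n=2))
```

The resulting list could then be compared against the 100 FLAGS above to see how stable the ranking is across database versions.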