Software for the lncRNA assembly workflow: LncFinder

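Our Bioinformatics Skill Tree (生信技能树) channel on Bilibili has a hands-on lncRNA data analysis course, but it has no accompanying notes. So we have arranged to share 100 papers with lncRNA assembly case studies, along with hands-on tutorial notes for the 100 software tools used in this workflow.

Below is one of the tutorial notes from the series on the 100 tools of the lncRNA assembly workflow.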

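I. How the software works

LncFinder is a newer lncRNA identification tool. It is built on three kinds of features: the logarithm-distance of hexamers, multi-scale structural information, and physicochemical features obtained through the fast discrete Fourier transform. To choose the best classifier, five widely used machine learning algorithms were evaluated with 10-fold cross-validation: logistic regression, support vector machine (SVM), random forest, extreme learning machine, and deep learning. SVM was finally selected as the LncFinder classifier. After comprehensive feature selection and model validation, LncFinder outperformed several state-of-the-art tools on multiple species. Users can also retrain LncFinder easily and efficiently with new datasets or different machine learning algorithms.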
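To make the Fourier-transform idea more concrete, here is a minimal base-R sketch. It is not LncFinder's own code. It assumes the sequence is turned into a numeric signal with EIIP (electron-ion interaction pseudopotential) values and then inspects the power spectrum from `fft()`. The helper name `eiip_power_spectrum` and the toy sequence are hypothetical and used only for illustration.

```r
# Commonly published EIIP values for the four nucleotides
eiip <- c(A = 0.1260, C = 0.1340, G = 0.0806, T = 0.1335)

# Hypothetical helper: map a DNA string to EIIP values and return the power spectrum
eiip_power_spectrum <- function(seq_string) {
  bases  <- strsplit(toupper(seq_string), "")[[1]]
  signal <- unname(eiip[bases])
  signal <- signal[!is.na(signal)]   # drop characters other than A/C/G/T
  Mod(fft(signal))^2                 # squared magnitude of the DFT
}

# Toy sequence with a perfect period of 3, mimicking codon periodicity
toy_seq <- paste(rep("ATG", 30), collapse = "")
spec    <- eiip_power_spectrum(toy_seq)
n       <- length(spec)

# Power at frequency 1/3 (index n/3 + 1) versus a neighbouring frequency
peak_at_third <- spec[n / 3 + 1]
neighbour     <- spec[n / 3]
print(c(peak_at_third = peak_at_third, neighbour = neighbour))
```

Coding sequences tend to show a stronger signal at frequency 1/3 than noncoding ones, which is the kind of physicochemical feature the description above refers to. The features LncFinder actually derives from the spectrum may differ from this sketch.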

II. Input format

Sequences in FASTA format
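Before running a prediction, it can help to sanity-check the FASTA input. The sketch below uses only base R and a small hypothetical FASTA file written to a temporary path. It counts the records, measures their lengths, and flags sequences with characters other than A/C/G/T. It assumes every header line is followed by at least one sequence line.

```r
# Write a small hypothetical FASTA file for demonstration
fa_path <- tempfile(fileext = ".fa")
writeLines(c(">tx1 toy transcript",
             "ATGGCGTACGTTAGC",
             "GGCTAA",
             ">tx2",
             "ATGCCCNNNTGA"), fa_path)

lines     <- readLines(fa_path)
is_header <- startsWith(lines, ">")

# Sequence ID = first word after ">"
ids <- sub("^>(\\S+).*", "\\1", lines[is_header])

# Join the sequence lines that belong to each header
group <- cumsum(is_header)
seqs  <- vapply(split(lines[!is_header], group[!is_header]),
                paste, character(1), collapse = "")
names(seqs) <- ids

seq_lengths   <- nchar(seqs)
has_ambiguous <- grepl("[^ACGTacgt]", seqs)

print(data.frame(id = ids, length = seq_lengths, ambiguous = has_ambiguous))
```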

III. Usage

The software can be run locally, and an online version is also available.

1. Online version

The online version is available at:

http://bmbl.sdstate.edu/lncfinder/

You can paste FASTA-formatted sequences directly and select the species.

2. Local version

Installation

install.packages("LncFinder")
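The script in this post also calls seqinr to read the FASTA file, so both packages need to be available. The following is a small sketch for checking what is missing before installing anything. Note that the package name on CRAN is `LncFinder`, and R package names are case-sensitive. The install line is left commented out so that the snippet only reports.

```r
# Packages used by the script in this post
needed <- c("LncFinder", "seqinr")

# TRUE for each package that can be loaded, FALSE otherwise
available <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!available]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All required packages are installed.")
}
```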

Running the script

# Prediction without secondary structure-based features
library(LncFinder)

# Read the candidate transcripts (FASTA) with seqinr
demo_DNA.seq <- seqinr::read.fasta("~/lncRNA_project/07.identification/step2/filter2_transcript_exon.fa")
Seqs <- demo_DNA.seq

### The first parameter of lnc_finder() indicates the unevaluated
### sequences; parameter "SS.features = FALSE" means predict sequences
### without secondary structure-based features; parameter
### 'format = "DNA"' means the format of input sequences is DNA;
### parameters "frequencies.file" and "svm.model" indicate predicting
### sequences with which model; parameter "parallel.cores" means the
### number of cores to conduct parallel computing, and "-1" means
### using all available cores (a single core is used here).
result_1 <- LncFinder::lnc_finder(Seqs,
                                  SS.features = FALSE,
                                  format = "DNA",
                                  frequencies.file = "human",
                                  svm.model = "human",
                                  parallel.cores = 1)

# Save the predictions as a tab-separated table; with row.names = TRUE the
# sequence IDs are written as the first column. The output directory must exist.
write.table(result_1,
            file = "~/lncRNA_project/07.identification/step3/lncFinder/lncFinder_result.txt",
            sep = "\t", row.names = TRUE, col.names = TRUE, quote = FALSE)

IV. Interpreting the output

Use the second column of the output file (Coding/NonCoding) to separate noncoding RNA from protein-coding transcripts.
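A minimal sketch of that filtering step in base R follows. It builds a small hypothetical table in place of the real result file. The column names `Pred` and `Coding.Potential` are assumptions about the `lnc_finder()` output, so check them against the header of your own file. Because the table was written with `row.names = TRUE`, the header has one field fewer than the data rows, and `read.table()` then takes the first column as row names.

```r
# Hypothetical stand-in for the lnc_finder() result (column names are assumed)
toy <- data.frame(Pred = c("NonCoding", "Coding", "NonCoding"),
                  Coding.Potential = c(0.08, 0.97, 0.21),
                  row.names = c("tx1", "tx2", "tx3"))

out_path <- tempfile(fileext = ".txt")
write.table(toy, file = out_path, sep = "\t",
            row.names = TRUE, col.names = TRUE, quote = FALSE)

# Read the table back; the first column becomes the row names (sequence IDs)
res <- read.table(out_path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Count the predictions per class and keep the IDs predicted as noncoding
print(table(res$Pred))
lnc_ids <- rownames(res)[res$Pred == "NonCoding"]

# Save the candidate lncRNA IDs, one per line
id_path <- tempfile(fileext = ".txt")
writeLines(lnc_ids, id_path)
```

The resulting ID list can then be used to pull the matching sequences out of the original FASTA file for the next step of the workflow.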
