Sunday, June 26, 2016

The rise of CRISPR

Playing with a new toy: rentrez




Using rentrez in RStudio to query the databases provided by NCBI

First, of course, RStudio needs to be installed; see this site's earlier post from July 2015.

Then install rentrez:

install.packages("rentrez")

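Load rentrez (the lib.loc path below is specific to this machine; a plain library("rentrez") is usually enough):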

library("rentrez", lib.loc="/usr/local/lib/R/site-library")



The steps below follow https://cran.r-project.org/web/packages/rentrez/vignettes/rentrez_tutorial.html

> library("rentrez", lib.loc="/usr/local/lib/R/site-library")
> entrez_dbs()
 [1] "pubmed"          "protein"      
 [3] "nuccore"         "nucleotide"  
 [5] "nucgss"          "nucest"      
 [7] "structure"       "genome"      
 [9] "annotinfo"       "assembly"    
[11] "bioproject"      "biosample"    
[13] "blastdbinfo"     "books"        
[15] "cdd"             "clinvar"      
[17] "clone"           "gap"          
[19] "gapplus"         "grasp"        
[21] "dbvar"           "gene"        
[23] "gds"             "geoprofiles"  
[25] "homologene"      "medgen"      
[27] "mesh"            "ncbisearch"  
[29] "nlmcatalog"      "omim"        
[31] "orgtrack"        "pmc"          
[33] "popset"          "probe"        
[35] "proteinclusters" "pcassay"      
[37] "biosystems"      "pccompound"  
[39] "pcsubstance"     "pubmedhealth"
[41] "seqannot"        "snp"          
[43] "sra"             "taxonomy"    
[45] "unigene"         "gencoll"      
[47] "gtr"          
> entrez_db_summary("geoprofiles")
 DbName: geoprofiles
 MenuName: GEO Profiles
 Description: Genes Expression Omnibus
 DbBuild: Build141002-1115.90
 Count: 108708851
 LastUpdate: 2016/06/21 04:48 

> entrez_db_searchable("geoprofiles")
Searchable fields for database 'geoprofiles'
  ALL All terms from all searchable fields
  UID Unique number assigned to publication
  FILT Limits the records
  ORGN Exploded organism names
  ACCN Accession for GDS (DataSet), GPL (Platform), GSM (Sample), GSE (Series)
  GDST GDS text from title and description
  GEOT Sample titles
  RTYP Platform reporter type, e.g. genbank, clone, orf
  GTYP Type of dataset
  VTYP Sample value type, e.g. log ratio, count
  NSAM Number of samples
  SRC Sample source
  ID Spot ID from GEO Platform, SAGE tag, Affy ProbeSet ID
  NAME Name or identifier for the spot, e.g. GenBank accession, CLONE_ID, ORF etc.
  SYMB Gene symbol (name) from Entrez-Gene or Entrez-UniGene.
  GDSC Gene Description
  RSTD Ranked standard deviation
  RMAX Maximal value of ranks
  RMIN Minimal value of ranks
  FINF Indicates an interesting or notable uid in the GDS context
  FTYP Type of flag that indicates a uid of interest, or outliers etc.
  GI GenBank Identifier
  ATYP Type of annotation (gene, unigene, nucleotide)
  GO Gene Ontology
  CHR Chromosomes
  CPOS Chromosome base position 
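
The field tags listed above can be combined into a search term. What follows is only a minimal sketch, not part of the original session: build_term() and search_geoprofiles() are hypothetical helpers made up for illustration, and the gene symbol TP53 and organism Homo sapiens are arbitrary example values. The actual query uses entrez_search() and entrez_summary() from rentrez, and it needs network access, so it is wrapped in a function and left commented out.

```r
# Build an Entrez query term from the SYMB and ORGN field tags
# shown by entrez_db_searchable("geoprofiles").
# build_term() is a hypothetical helper for this sketch, not part of rentrez.
build_term <- function(symbol, organism) {
  sprintf("%s[SYMB] AND %s[ORGN]", symbol, organism)
}

# Example values chosen only for illustration.
term <- build_term("TP53", "Homo sapiens")
print(term)

# Hypothetical wrapper around the real rentrez call.
# It needs the rentrez package and network access, so it only runs when called.
search_geoprofiles <- function(term, retmax = 5) {
  res <- rentrez::entrez_search(db = "geoprofiles", term = term, retmax = retmax)
  cat("Total hits:", res$count, "\n")
  res$ids  # character vector of matching record IDs (at most retmax of them)
}

# To try it online, uncomment:
# ids <- search_geoprofiles(term)
# summaries <- rentrez::entrez_summary(db = "geoprofiles", id = ids)
```

Keeping the term construction separate from the network call makes it easy to check the query string before sending anything to NCBI.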

Friday, June 17, 2016

Scripting languages make a strong comeback

The chart below is a screenshot from http://www.tiobe.com/tiobe_index



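Python, Perl, and Ruby, the three musketeers, have all squeezed into the top 10 at the same time, while C's rating has dropped quite a bit. Isn't that how it should be? The work most people need to do ought to be something a scripting language can handle quickly; doing the same in C/C++ would take who knows how many times more lines of code and how much more time.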

Speaking of which, C# and Objective-C look to be in danger too…