domain_annotation

#------------------
# Protein domain annotation
#------------------
domain_annot=function(pfam_id=c()){
  if(is.null(pfam_id)){
    print('pfam_id : a character vector of Pfam accession IDs')
    print('Use biomaRt to look up Pfam IDs')
    print('Try the example below')
    cat("library(biomaRt)\n")
    cat("ensembl=useMart(biomart='ensembl',dataset = 'hsapiens_gene_ensembl')\n")
    cat("tmp=getBM(attributes = c('hgnc_symbol','ensembl_peptide_id','pfam','pfam_start','pfam_end'),
          filters= 'ensembl_peptide_id',values='ENSP00000428056',mart=ensembl)\n")
  }else{
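    # mclapply() comes from 'parallel'; 'rjson' was loaded but never used
    library('RCurl');library('parallel')
    ids=pfam_id
    # Fetch each Pfam family page and pull out the text after 'Summary:'.
    # Note: pfam.xfam.org has since been merged into InterPro, so this URL
    # may no longer serve the page layout that the parsing here expects.
    id2=unlist(mclapply(ids,function(x){
      print(x)
      url=getURL(paste0('http://pfam.xfam.org/family/',x))
      info=unlist(strsplit(url,"\""))
      info=info[grep(info,pattern = 'Summary:')]
      # No summary found: return NULL so unlist() drops this entry and the
      # length check after the loop reports the mismatch
      if(length(info)==0) return(NULL)
      info=unlist(strsplit(info,": ",fixed = TRUE))[2]
      info=unlist(strsplit(info,"<",fixed = TRUE))[1]
      info
      }))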
    print('Conversion done')
    if(length(id2)==length(ids)){
      names(id2)=ids
    }else{
      warning('Some Pfam IDs returned no summary\nResults are returned without names')
    }
    return(id2)
  }
}
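
The string-splitting step inside domain_annot can be tried offline. The sketch below repeats the same three splits (on the double quote, on ": ", then on "<") against a made-up HTML fragment. The helper name extract_pfam_summary and the fragment fake_html are hypothetical and only for illustration; the real Pfam page markup is not reproduced here and may differ.

```r
# Hypothetical helper: mirrors the parsing steps used inside domain_annot,
# but works on a string that is passed in rather than on a downloaded page.
extract_pfam_summary <- function(html) {
  # Split on double quotes, as the original function does
  pieces <- unlist(strsplit(html, "\"", fixed = TRUE))
  # Keep the pieces that contain the 'Summary:' label
  hit <- grep("Summary:", pieces, fixed = TRUE, value = TRUE)
  # Return NA when the label is missing (an assumption made for this sketch)
  if (length(hit) == 0) return(NA_character_)
  # Take the text after the first ': ' and cut it at the next tag
  after_label <- unlist(strsplit(hit[1], ": ", fixed = TRUE))[2]
  unlist(strsplit(after_label, "<", fixed = TRUE))[1]
}

# Made-up fragment, NOT real Pfam output
fake_html <- '<div class="block"><h2>Summary: Protein kinase domain</h2><p class="text">...</p></div>'

print(extract_pfam_summary(fake_html))
# [1] "Protein kinase domain"
```

Returning NA for a missing label keeps the result the same length as the input, which is one possible design choice; it differs from domain_annot, which relies on a length mismatch to detect failures.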


