

mhcflurry: installing, running, and loading the results


1. Installation

 

pip install mhcflurry
mhcflurry-downloads fetch
# The directory holding the files mhcflurry needs at runtime can be printed with:
# mhcflurry-downloads path models_class1_presentation
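
To confirm from Python that the install and the download both succeeded, the path command above can be wrapped in a small helper. This is a minimal sketch: `models_path` is a name chosen here for illustration, and the helper returns `None` instead of raising when the `mhcflurry-downloads` executable is missing or the command fails.

```python
import shutil
import subprocess


def models_path(name="models_class1_presentation"):
    """Return the directory reported by `mhcflurry-downloads path NAME`, or None."""
    exe = shutil.which("mhcflurry-downloads")
    if exe is None:
        # mhcflurry is not installed in this environment (or not on PATH)
        return None
    result = subprocess.run([exe, "path", name], capture_output=True, text=True)
    if result.returncode != 0:
        # The command failed, e.g. because the download name is unknown
        return None
    return result.stdout.strip()


print(models_path())
```

A `None` result is a hint to rerun `mhcflurry-downloads fetch` or to switch to the Docker route.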

 

If the steps above fail, fall back to Docker. The Dockerfile is shown further below.

First, download the repository from this URL:

https://github.com/openvax/mhcflurry/

 

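Then edit the Dockerfile in the repository. In the version below, the notebook copy step and the Jupyter launch are commented out.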

# mhcflurry dockerfile
FROM continuumio/miniconda3:latest

LABEL maintainer="Tim O'Donnell timodonnell@gmail.com"

WORKDIR /root
RUN mkdir /workdir

# Install / upgrade packages
RUN pip install --upgrade pip && pip install jupyter seaborn

# Install dependencies (doing this first to have them cached).
COPY requirements.txt /tmp/mhcflurry-requirements.txt
RUN pip install -r /tmp/mhcflurry-requirements.txt

# We pre-download the downloads here to avoid having to redownload them every
# time the source code changes (i.e. we do this before the COPY . so these
# downloads are part of the docker cache).
RUN mkdir /tmp/mhcflurry-downloads
COPY mhcflurry/downloads.yml /tmp/mhcflurry-downloads
RUN wget -P /tmp/mhcflurry-downloads \
    $(python -c 'import yaml ; d = yaml.safe_load(open("/tmp/mhcflurry-downloads/downloads.yml")) ; downloads = d["releases"][d["current-release"]]["downloads"] ; defaults = [item["url"] for item in downloads if item["default"]] ; print("\n".join(defaults))')

# Copy example notebook to current directory so it's easily found.
#COPY notebooks/* .

# Copy over source code and install mhcflurry.
COPY . mhcflurry
RUN pip install -e mhcflurry/
RUN mhcflurry-downloads fetch --already-downloaded-dir /tmp/mhcflurry-downloads

# Notebook server disabled (uncomment the two lines below to serve Jupyter on port 9999)
#EXPOSE 9999
#CMD ["jupyter", "notebook", "--port=9999", "--no-browser", "--ip=0.0.0.0", "--allow-root", "--NotebookApp.token=''", "--NotebookApp.password=''"]
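
The Python one-liner in the `wget` step is dense. Expanded, it reads `downloads.yml`, looks up the release named by `current-release`, and prints the URL of every download flagged `default`. The sketch below reproduces that selection on an already-parsed dictionary. The function name `default_download_urls` and the sample data (release name, download names, URLs) are made up for illustration and are not the contents of the real file.

```python
def default_download_urls(config):
    """Return the URLs of the default downloads for the current release."""
    release = config["releases"][config["current-release"]]
    return [item["url"] for item in release["downloads"] if item["default"]]


# Hypothetical stand-in for yaml.safe_load(open("downloads.yml"))
sample = {
    "current-release": "x.y.z",
    "releases": {
        "x.y.z": {
            "downloads": [
                {"name": "models_a", "url": "https://example.org/models_a.tar.bz2", "default": True},
                {"name": "data_b", "url": "https://example.org/data_b.tar.bz2", "default": False},
            ]
        }
    },
}

print("\n".join(default_download_urls(sample)))
```

Only the entry marked `default` is printed, which is why the image pre-downloads just the default model files.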

Then build the image with the following command:

docker build -t osj-mhcflurry:latest .

 

If the build fails as well, skip it and run the published image directly:

docker run -p 9999:9999 -v /path/to/workdir:/workdir --rm openvax/mhcflurry:latest
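
When the container is launched from a script rather than typed by hand, the same `docker run` line can be assembled as an argument list. A minimal sketch follows. `docker_run_args` and its parameters are names invented here, and the function only builds the command without running it.

```python
import shlex


def docker_run_args(workdir, image="openvax/mhcflurry:latest", port=9999, command=None):
    """Build the argument list for the docker run line shown above."""
    args = [
        "docker", "run",
        "-p", f"{port}:{port}",        # publish the notebook port
        "-v", f"{workdir}:/workdir",   # mount the host work directory
        "--rm",                        # remove the container on exit
        image,
    ]
    if command:
        # Anything after the image name replaces the image's default command
        args += list(command)
    return args


args = docker_run_args("/path/to/workdir")
print(shlex.join(args))
```

Passing the list to `subprocess.run(args, check=True)` would start the container. Supplying `command=["python", "/workdir/some_script.py"]` (a hypothetical path) should run that script instead of the notebook server, assuming the image sets its default through `CMD` and defines no `ENTRYPOINT`, as the Dockerfile above suggests.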

 

2. Running and loading the results

# Script that will be executed inside the Docker container
mhcflurry_docker_command="""
import os
import subprocess as sbp
import warnings
import pandas as pd
import mhcflurry
params={
    'in':'/workdir/mhcflurry-input.pickle',
    'out':'/workdir/mhcflurry-result.tsv'
}
# Load a predictor (warnings are suppressed only while loading)
print('MHCflurry predictor loading')
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    predictor = mhcflurry.Class1PresentationPredictor.load()
# Load the input table written by running_mhcflurry()
indata={}
indata['in']=pd.read_pickle(params['in'])
# Prediction
def mhcflurry_prediction(x):
    # Predict allele by allele, then merge the scores back onto the input rows
    results=[]
    for hla in x['mhc'].unique():
        peptides=list(x.loc[x['mhc']==hla,'peptide'].unique())
        # Strip the 'HLA-' prefix, e.g. 'HLA-A*02:01' -> 'A*02:01'
        hla2=hla.split('-',1)[1]
        res=predictor.predict(peptides,[hla2],include_affinity_percentile=True)
        res['mhc']=hla
        results.append(res)
    df=pd.concat(results,axis=0)
    df=df.reset_index(drop=True).drop(['peptide_num','sample_name'],axis=1)
    return pd.merge(x,df,on=['peptide','mhc'],how='left')
pdata={}
pdata['pred']=mhcflurry_prediction(x=indata['in'])
pdata['pred'].to_csv(params['out'],sep='\t')
# Grant 777 permissions so the files are accessible from the host
sbp.call('chmod 777 -R /workdir/*',shell=True)
# Remove the input pickle now that the result has been written
os.remove(params['in'])
print('Results saved')
"""

# Write the files needed to run the prediction inside Docker
def running_mhcflurry(x,path,mhcflurry_docker_command):
    '''
    # parameters
    x : dataframe ['peptide','mhc', other columns]
    path : /path/to/workdir
    mhcflurry_docker_command : Python source to run inside the container
    # effect
    writes mhcflurry-docker-run.py and mhcflurry-input.pickle to path and
    prints the remaining manual steps; returns None (the result is written
    later, by the script in the container, to mhcflurry-result.tsv)
    '''
    import pickle
    # Export the script
    docker_runpy_path=path+'/mhcflurry-docker-run.py'
    with open(docker_runpy_path,'w') as f:
        f.write(mhcflurry_docker_command)
    # Export the input dataframe
    pickle_path=path+'/mhcflurry-input.pickle'
    with open(pickle_path,'wb') as f:
        pickle.dump(obj=x,file=f)
    print('Run the following command line')
    print(f"docker run -p 9999:9999 -v {path}:/workdir --rm openvax/mhcflurry:latest")
    print('Then open example1 in the notebook and run the prediction')
    print('!python /workdir/mhcflurry-docker-run.py')
    print('mhcflurry-input is located at /workdir/mhcflurry-input.pickle')
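
The script strips the `HLA-` prefix by splitting each `mhc` value on the hyphen, which raises `IndexError` for a name that contains no hyphen. If the `mhc` column may mix both spellings, a more forgiving helper could be used instead. This is only a sketch: `strip_hla_prefix` is a hypothetical name, and the allele names are examples.

```python
def strip_hla_prefix(allele):
    """Return the allele name without a leading 'HLA-' prefix."""
    prefix = "HLA-"
    if allele.upper().startswith(prefix):
        return allele[len(prefix):]
    # No prefix present: pass the name through unchanged
    return allele


for name in ["HLA-A*02:01", "A*02:01", "HLA-B*07:02"]:
    print(name, "->", strip_hla_prefix(name))
```

Inside the container script, the helper would take the place of the split on `'-'`.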

 

Start the notebook server, open example1.ipynb, and run the following command in a cell:

!python /workdir/mhcflurry-docker-run.py

The result is then written to mhcflurry-result.tsv in the work directory.
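
Once the file exists, it can be read back on the host. A minimal sketch using only the standard library is given below. `load_results` is a name chosen here, and the file is assumed to be the tab-separated output of `DataFrame.to_csv(..., sep='\t')`, whose first column is the unnamed index. The demo rows are placeholders, not real predictions.

```python
import csv
import os
import tempfile


def load_results(path):
    """Read the TSV into a list of dicts, dropping the unnamed index column."""
    rows = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            row.pop("", None)  # to_csv writes the index under an empty header
            rows.append(row)
    return rows


# Demo on a tiny stand-in file with placeholder values
demo = "\tpeptide\tmhc\tscore\n0\tSIINFEKL\tHLA-A*02:01\t0.5\n"
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write(demo)
rows = load_results(tmp.name)
os.remove(tmp.name)
print(rows)
```

All values come back as strings here. With pandas, `pd.read_csv(path, sep='\t', index_col=0)` returns the same table as a DataFrame with numeric columns parsed.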
