AXEL-F - Standalone version 1.1.0
==========================================

Introduction
------------
This package contains the AXEL-F prediction tool, which improves epitope
prediction by taking into account both peptide-MHC binding affinities and the
expression levels of each peptide's source protein. The collection contains
Python scripts that run in Linux-based environments and a Dockerfile that lets
users build an image containing the AXEL-F tool.

Prerequisites:
-------------
+ Python 3.6 or higher
  * http://www.python.org/

+ tcsh
  * http://www.tcsh.org/Welcome
  - Under ubuntu: sudo apt-get install tcsh

Installation (Linux environment):
--------------------------------
Unpack the tar.gz file (IEDB_AXELF-VERSION.tar.gz), then install the packages
listed in requirements.txt, which AXEL-F needs to run.

  $ tar -xzvf IEDB_AXELF-VERSION.tar.gz
  $ cd axelf
  $ pip install -r requirements.txt

Installation (Mac OS or others):
-------------------------------
Under Mac OS, Axelf won't work properly. The workaround is to run it in a
Docker container.

  $ tar -xzvf IEDB_AXELF-VERSION.tar.gz
  $ cd axelf
  $ docker build -t axelf_img .

Help:
----
  On Linux  : python run_axelf.py -h    (or: python run_axelf.py --help)
  Container : docker run --rm axelf_img python run_axelf.py --help

View all available alleles for Axelf:
------------------------------------
  On Linux  : python run_axelf.py -p
  Container : docker run --rm axelf_img python run_axelf.py -p

CSV Examples:
------------
A CSV file given to Axelf as input must be in a valid CSV format. A valid CSV
file contains a header row and has no missing data or empty cells in any row.

1. CSV file with peptide sequences only.
   The simplest CSV file you can have is a file containing peptide
   information only.

   Ex)
     peptide
     ADMGHLKY
     ELDDTLKY
     FMDHVLRY
     FSDLPLRV

   With a file like the example above, the allele must be supplied on the
   command line, along with either a TPM value or TCGA data. Here are some of
   the command options...

   * Providing allele with TPM value.
     On Linux  : python run_axelf.py tests/data/input/peptide_input.csv -a "HLA-A*01:01" -t 3.0
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/peptide_input.csv -a "HLA-A*01:01" -t 3.0

   * Providing allele with TCGA data (MUST specify cancer type and gene name).
     On Linux  : python run_axelf.py tests/data/input/peptide_input.csv -s tcga -c CESC -g TIGAR -a "HLA-A*01:01"
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/peptide_input.csv -s tcga -c CESC -g TIGAR -a "HLA-A*01:01"

2. CSV file with peptide sequences and allele name only.
   Axelf can also take a CSV file that contains allele information along with
   the peptide information. In this case, the allele flag (-a) is not
   necessary unless you want to override the allele in the CSV file.

   Ex)
     peptide,allele
     ADMGHLKY,HLA-A*01:01
     ELDDTLKY,HLA-A*01:01
     FMDHVLRY,HLA-A*01:01
     FSDLPLRV,HLA-A*01:01

   * Providing a TPM value.
     On Linux  : python run_axelf.py tests/data/input/peptide_allele_input.csv -t 3.0
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/peptide_allele_input.csv -t 3.0

   * Providing TCGA data (MUST specify cancer type and gene name).
     On Linux  : python run_axelf.py tests/data/input/peptide_allele_input.csv -s tcga -c CESC -g TIGAR
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/peptide_allele_input.csv -s tcga -c CESC -g TIGAR

3. CSV file provided with peptide sequences, allele, and TPM value.
   When using TPM values, you may also put a TPM column inside the CSV file.

   Ex)
     peptide,allele,tpm
     ADMGHLKY,HLA-A*01:01,122.985
     ELDDTLKY,HLA-A*01:01,34.705
     FMDHVLRY,HLA-A*01:01,16.2825
     FSDLPLRV,HLA-A*01:01,3.7025

   * When using such a CSV file, no flag needs to be specified.
     On Linux  : python run_axelf.py tests/data/input/sample_input.csv
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/sample_input.csv

4. CSV file provided with peptide sequences, allele, and gene name.
   If each peptide needs to be associated with a different gene, simply add a
   column containing gene names.

   Ex)
     allele,peptide,gene name
     HLA-B*52:01,EGMKTQYSV,RP11-368I23.2
     HLA-C*12:02,YLASLHPRL,RP11-167B3.1
     HLA-A*26:01,ELFQGSDLGV,RP11-742D12.2
     HLA-B*38:01,LRDDKDNIERL,RAB4B

   * Providing TCGA data by specifying cancer type only.
     On Linux  : python run_axelf.py tests/data/input/sample_input2.csv -s tcga -c CESC
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/sample_input2.csv -s tcga -c CESC

FASTA Examples:
--------------
A FASTA file given to Axelf as input must be in a valid FASTA format. A valid
FASTA file, according to the NIH, is a single-line description followed by
lines of sequence data. The single-line description is distinguished from the
sequence data by a ">" symbol at the beginning.

   Ex)
     >SEQUENCE_1
     MTEITAAMVKELRESTGAGMMDCKNALSETNGDFDKAVQLLREKGLGKAAKKADRLAAEG
     LVSVKVSDDFTIAAMRPSYLSYEDLDMTFVENEYKALVAELEKENEERRRLKDPNKPEHK
     IPQFASRKQLSDAILKEAEEKIKEELKAQGKPEKIWDNIIPGKMNSFIADNSQLDSKLTL
     MGQFYVMDDKKTVEQVIAEKEKEFGGKIKIVEFICFEVGEGLEKKTEDFAAEVAAQL

Also, note that whenever a FASTA input is used, the peptide length must be
specified using the "-l" or "--peptide-length" flag. This allows each FASTA
sequence to be broken up into k-mers (peptides of length 'k') for processing.

1. Providing a TPM value.
     On Linux  : python run_axelf.py tests/data/input/sample_input.fasta --tpm 3.0 -l 8 -a "HLA-A*01:01"
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/sample_input.fasta --tpm 3.0 -l 8 -a "HLA-A*01:01"

2. Providing TCGA data (MUST specify cancer type and gene name).
     On Linux  : python run_axelf.py tests/data/input/sample_input.fasta -s tcga -c CESC -g TIGAR -l 8 -a "HLA-C*12:02"
     Container : docker run --rm axelf_img python run_axelf.py tests/data/input/sample_input.fasta -s tcga -c CESC -g TIGAR -l 8 -a "HLA-C*12:02"
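
Note on k-mers:
--------------
To make the effect of the "-l" flag concrete, the sketch below shows one way a
FASTA sequence could be broken up into peptides of length 'k'. It is an
illustration only and is not taken from the AXEL-F source: the function names
read_fasta and kmers are hypothetical, and it assumes overlapping windows that
advance one residue at a time, which the tool itself may or may not do.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (description, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def kmers(sequence, k):
    """Return every overlapping peptide of length k, in sequence order.

    Assumption: a sliding window with step 1. A sequence shorter than k
    yields an empty list.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    example = ">SEQUENCE_1\nMTEITAAMVK\nELRESTGAGM\n"
    for name, seq in read_fasta(example):
        for peptide in kmers(seq, 8):
            print(name, peptide)
```

Under this assumption, a sequence of length n yields n - k + 1 peptides, so a
larger "-l" value produces fewer, longer peptides from the same input.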