1.
programs for sorting, splitting, or merging fasta sequences;
record parsing and data conversion using GenBank, fasta, nib, and blast data formats;
sequence alignment;
motif searching;
hidden Markov model development; and much more.
Library subroutines are available for everything from managing C data structures such as linked lists, balanced trees, hashes, and directed graphs to developing routines for SQL, HTML, or CGI code. Additional library functions are available for biological sequence and data manipulation tasks such as reverse complementation, codon and amino acid lookup and sequence translation, as well as functions specifically designed for extracting, loading, and manipulating data in the UCSC Genome Browser Databases.
2. Installing it on Debian/Ubuntu
http://genomewiki.ucsc.edu/index.php/Source_tree_compilation_on_Debian/Ubuntu
3.
1)
MySQL is a prerequisite for the build. Although MySQL is installed on the Rice SUG@R cluster, I could not find the file "libmysqlclient.a" there, and the UCSC Kent Utilities build needs it. So I had to install a copy of MySQL in my home directory. The newest version of MySQL at the time of writing is 5.5.24, and it can be downloaded from the MySQL website. I downloaded the source code version for Generic Linux (Architecture Independent). The current version of MySQL needs CMake to compile, so before doing anything with MySQL we have to back up a step and install CMake first.
    tar -xzvf cmake-2.8.8.tar.gz
    ./configure --prefix=/users/NetID/local/
Then install MySQL using CMake:
    tar xvzf mysql-5.5.24.tar.gz
    mkdir -p /users/NetID/local/mysql
    cmake -DCMAKE_INSTALL_PREFIX=/users/NetID/local/mysql
Of course, more work is needed to configure MySQL so that it runs properly. But since SUG@R already provides MySQL and we only need one file from this installation, those configuration steps can wait until we actually need to run our local copy of MySQL.
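Since the whole point of this local install is one file, it may be worth confirming that the static client library really exists before moving on. The following is only a sketch: the function name find_mysql_static_lib is made up for illustration, and the list of subdirectories it searches (lib, lib64, lib/mysql) is an assumption about where an install might place the file.

```shell
# Hypothetical helper: print the path of libmysqlclient.a if it exists
# under the given install prefix, otherwise return a non-zero status.
find_mysql_static_lib() {
    prefix="$1"
    # Assumed candidate locations; adjust them to match the actual install.
    for dir in "$prefix/lib" "$prefix/lib64" "$prefix/lib/mysql"; do
        if [ -f "$dir/libmysqlclient.a" ]; then
            printf '%s\n' "$dir/libmysqlclient.a"
            return 0
        fi
    done
    return 1
}

# Example call using the install prefix from the cmake step above.
find_mysql_static_lib "/users/NetID/local/mysql" \
    || echo "libmysqlclient.a not found under the given prefix"
```

If the function prints a path, that path is what the MYSQLLIBS variable should point at in the later step.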
2) Now we can finally work on building UCSC Kent Utilities. Check the shell environment first. We need to change the value of MACHTYPE to x86_64 for our build and also set this in the .bashrc file under my home directory.
    x86_64-redhat-linux-gnu
    export MACHTYPE=x86_64
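Instead of hard-coding the value, the short form can also be derived from whatever the shell currently reports. This is a minimal sketch, not part of the original procedure: short_machtype is a hypothetical helper name, and it assumes the value wanted is simply the first dash-separated field of the machine triplet (which gives x86_64 for the value shown above).

```shell
# Hypothetical helper: keep only the part of a machine triplet that
# comes before the first "-".
short_machtype() {
    printf '%s\n' "${1%%-*}"
}

# Use the current MACHTYPE if set, otherwise fall back to `uname -m`.
MACHTYPE=$(short_machtype "${MACHTYPE:-$(uname -m)}")
export MACHTYPE
echo "MACHTYPE=$MACHTYPE"
```

Placing the same lines in .bashrc would make the setting persist across logins, which is what the step above does by hand.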
3) Under the $HOME/bin/ directory, create a directory named x86_64 to hold the binaries generated by the build.
    mkdir -p $HOME/bin/${MACHTYPE}
4) Create the MySQL shell environment variables we need:
    MYSQLINC=/usr/include/mysql
    MYSQLLIBS="/users/NetID/local/mysql/lib/libmysqlclient.a -lz"
    export MYSQLINC MYSQLLIBS
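A wrong path in either variable tends to show up only as a compile or link error later, so a quick check before building can save time. The sketch below is an illustration under assumptions: check_mysql_env is a hypothetical name, it assumes the header of interest is mysql.h inside MYSQLINC, and it assumes the first word of MYSQLLIBS is the path to the static library, as in the values set above.

```shell
# Hypothetical helper: verify that the header directory and the static
# library named in the two variables actually exist.
check_mysql_env() {
    inc="$1"
    libs="$2"
    # First space-separated word of MYSQLLIBS is taken as the library path.
    lib_file=${libs%% *}
    status=0
    if [ ! -f "$inc/mysql.h" ]; then
        echo "missing header: $inc/mysql.h"
        status=1
    fi
    if [ ! -f "$lib_file" ]; then
        echo "missing library: $lib_file"
        status=1
    fi
    return $status
}

# Check the values currently set in the environment.
check_mysql_env "${MYSQLINC:-}" "${MYSQLLIBS:-}" \
    || echo "fix the paths above before building"
```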
5) Now we are ready to compile those utilities. Unzip the UCSC source code we downloaded and we will see a resulting directory named kent.
6) Then we are done. All the compiled binary utilities can be found in the $HOME/bin/$MACHTYPE directory. For me, that is /users/NetID/bin/x86_64/. Just try one, say faSplit. Typing faSplit in the shell prints the usage message for this utility, so our build works!
    cd $HOME/bin/$MACHTYPE
    faSplit - Split an fa file into several files.
    faSplit how input.fa count outRoot
    where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'.
    Files split by sequence will be broken at the nearest fa record boundary.
    Files split by base will be broken at any base.
    Files broken by size will be broken every count bases.
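To go one step beyond the usage message, a small real run can be tried. The following is only a sketch following the "faSplit how input.fa count outRoot" form shown above: the two sequences are made up, the temporary directory and the outRoot name "part" are arbitrary choices, and the script skips the split if faSplit is not on the PATH.

```shell
# Write a tiny two-record fasta file (made-up sequences for illustration).
workdir=$(mktemp -d)
cat > "$workdir/input.fa" <<'EOF'
>seq1
ACGTACGTAC
>seq2
TTGGCCAATT
EOF

# Split by sequence into 2 files, broken at fa record boundaries.
if command -v faSplit >/dev/null 2>&1; then
    faSplit sequence "$workdir/input.fa" 2 "$workdir/part"
    ls "$workdir"
else
    echo "faSplit not found; add \$HOME/bin/\$MACHTYPE to PATH first"
fi
```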