Thursday, September 13, 2012

UCSC Source Tree Utilities

1.
programs for sorting, splitting, or merging fasta sequences;
record parsing and data conversion using GenBank, fasta, nib, and blast data formats;
sequence alignment;
motif searching;
hidden Markov model development; and much more.

Library subroutines are available for everything from managing C data structures such as linked lists, balanced trees, hashes, and directed graphs to developing routines for SQL, HTML, or CGI code. Additional library functions are available for biological sequence and data manipulation tasks such as reverse complementation, codon and amino acid lookup and sequence translation, as well as functions specifically designed for extracting, loading, and manipulating data in the UCSC Genome Browser Databases.

2. Install it on Debian/Ubuntu
http://genomewiki.ucsc.edu/index.php/Source_tree_compilation_on_Debian/Ubuntu

3.

Compiling UCSC Source Tree Utilities on OSX


1) MySQL is a prerequisite for the build. Although it is installed on the Rice SUG@R cluster, I could not find the file “libmysqlclient.a” there, and the UCSC Kent Utilities build needs it. So I had to install a copy of MySQL in my home directory. The newest version of MySQL is 5.5.24 and it can be downloaded from the MySQL website. I downloaded the source code version for Generic Linux (Architecture Independent). The current version of MySQL needs CMake to compile, so before doing anything with MySQL we have to back up a step and install CMake first.
tar -xzvf cmake-2.8.8.tar.gz
cd cmake-2.8.8
./configure --prefix=/users/NetID/local/
gmake
make install
Then install MySQL using CMake:
tar xvzf mysql-5.5.24.tar.gz
cd mysql-5.5.24
mkdir -p /users/NetID/local/mysql
cmake -DCMAKE_INSTALL_PREFIX=/users/NetID/local/mysql .
make
make install
Of course, more work is needed to configure MySQL before it runs properly. But since SUG@R already provides MySQL and we only need one file from this installation, we can take care of those configuration steps later, when we actually need to run our local copy of MySQL.
2) Now we can finally work on building the UCSC Kent Utilities. Check the shell environment first. We need to change the value of MACHTYPE to x86_64 for our build, and also set it in the .bashrc file in my home directory.
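Since the only thing needed from this MySQL install is the static client library, it is worth checking that the file actually landed under the install prefix before moving on. Here is a minimal sketch, assuming the prefix used in the cmake step above. The helper name find_static_client is hypothetical, and the exact subdirectory the library ends up in may vary, so the sketch searches the whole prefix:

```shell
# Hypothetical helper: print the path of libmysqlclient.a under a prefix,
# or return non-zero if it is not there.
find_static_client() {
    prefix="$1"
    # Search the whole prefix; the subdirectory (lib/, lib64/, ...) can vary.
    found=$(find "$prefix" -name 'libmysqlclient.a' -type f 2>/dev/null | head -n 1)
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
        return 0
    fi
    echo "libmysqlclient.a not found under $prefix" >&2
    return 1
}

# Assumed prefix from the cmake step; replace NetID with your own.
find_static_client /users/NetID/local/mysql || true
```

If the helper prints a path, that path is what goes into MYSQLLIBS later on.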
echo $MACHTYPE
x86_64-redhat-linux-gnu
MACHTYPE=x86_64
emacs .bashrc
export MACHTYPE=x86_64 # add this line into the .bashrc file
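The short value wanted here is just the part of MACHTYPE before the first hyphen, so it can be derived instead of typed by hand. This is a rough sketch under that assumption. The helper names short_machtype and add_line_once are hypothetical, and the fallback to uname -m for shells that do not set MACHTYPE is my own addition:

```shell
# Hypothetical helper: strip everything from the first '-' onward,
# e.g. x86_64-redhat-linux-gnu becomes x86_64.
short_machtype() {
    printf '%s\n' "${1%%-*}"
}

# Hypothetical helper: append a line to a file only if it is not already
# there, so re-running the setup does not duplicate the export.
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Fall back to `uname -m` when the shell does not define MACHTYPE (assumption).
MACHTYPE=$(short_machtype "${MACHTYPE:-$(uname -m)}")
export MACHTYPE
echo "$MACHTYPE"
```

To make the setting permanent without opening an editor, one could then run add_line_once 'export MACHTYPE=x86_64' "$HOME/.bashrc".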
3) Under the $HOME/bin/ directory, create a directory named x86_64 for the binaries generated during compilation.
mkdir -p $HOME/bin/${MACHTYPE}
4) Create the MySQL shell environment variables we need:
MYSQLINC=/usr/include/mysql
MYSQLLIBS="/users/NetID/local/mysql/lib/libmysqlclient.a -lz"
export MYSQLINC MYSQLLIBS
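A wrong path in either variable only shows up later as a compile or link error, so a quick pre-flight check can save some time. The sketch below assumes the build expects to find mysql.h inside MYSQLINC and that the first word of MYSQLLIBS is the static library path. The function name check_mysql_env is hypothetical:

```shell
# Hypothetical pre-flight check for the two variables set above.
check_mysql_env() {
    status=0
    # Assumption: the MySQL client header mysql.h lives directly in MYSQLINC.
    if [ ! -f "${MYSQLINC:-}/mysql.h" ]; then
        echo "mysql.h not found in MYSQLINC=${MYSQLINC:-}" >&2
        status=1
    fi
    # MYSQLLIBS holds the library path followed by linker flags; take the first word.
    libs="${MYSQLLIBS:-}"
    lib=${libs%% *}
    if [ ! -f "$lib" ]; then
        echo "static library not found: $lib" >&2
        status=1
    fi
    return $status
}

if check_mysql_env; then
    echo "MySQL build variables look fine"
else
    echo "fix MYSQLINC / MYSQLLIBS before running make" >&2
fi
```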
5) Now we are ready to compile the utilities. Unzip the UCSC source code we downloaded, which produces a directory named kent.
unzip jksrc.zip
cd kent/src
make libs
cd utils/
make
6) Then we are done. All the compiled binary utilities can be found in the $HOME/bin/$MACHTYPE directory. For me, that is /users/NetID/bin/x86_64/. Just try one, say faSplit. Typing faSplit in the shell with no arguments prints the usage for this utility, so our build works!
cd $HOME/bin/$MACHTYPE
./faSplit
faSplit - Split an fa file into several files.
usage:
  faSplit how input.fa count outRoot
where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'.
Files split by sequence will be broken at the nearest fa record boundary.
Files split by base will be broken at any base.
Files broken by size will be broken every count bases.
...
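To go one step past the usage message, the utility can be tried on a small file. This is only a sketch: the four sequences are made up for illustration, the output root "part" is an arbitrary name, and the call simply follows the usage line above (how = sequence, count = 2). It skips the split if faSplit is not on the PATH:

```shell
# Build a tiny FASTA file with four records (made-up sequences).
sample=$(mktemp -d)
cat > "$sample/input.fa" <<'EOF'
>seq1
ACGTACGTAC
>seq2
GGGTTTCCCA
>seq3
TTTTAAAACC
>seq4
CAGTCAGTCA
EOF

# Count records by counting header lines.
grep -c '^>' "$sample/input.fa"

# Split by record into 2 files, following: faSplit how input.fa count outRoot
if command -v faSplit >/dev/null 2>&1; then
    faSplit sequence "$sample/input.fa" 2 "$sample/part"
    ls "$sample"
else
    echo "faSplit is not on PATH; add \$HOME/bin/\$MACHTYPE to PATH first" >&2
fi
```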
