Sunday, January 29, 2012

Transform list to matrix and convert back

http://cos.name/cn/topic/105864

http://cos.name/cn/topic/101372

I do not agree with him


Cultural history holds back Chinese research


We cannot freely choose a culture for ourselves in China now. Everything is controlled, culture especially. So how did you reach this conclusion, Peng Gong, a so-called intellectual in China? This is not the time to ask whether culture holds back research. It is the time for us to ask how, and to what extent, communist politics has depleted our freedom of ....

Do you know what happened during the Cultural Revolution? We have had no culture for at least several decades.

Python for Non-Programmers


(1) Python for Non-Programmers

(2) tutorial from Pasteur Institute 

http://www.pasteur.fr/formation/infobio/python/

(3) a guide from PLoS Computational Biology

(4) programming for biologists

Programming Resources

ANGUS

ANGUS, a number of detailed tutorials on mapping, assembly, mRNAseq, ChIP-seq, and resequencing analysis


(1) how to do basic scientific data analysis on UNIX
  • UNIX, ssh, and scp
  • Renting a computer from Amazon
  • Running BLASTs on UNIX
  • UNIX, BLAST, and long-running jobs
  • Working with CSV files and Python
  • Plotting with matplotlib
  • Scripts, Reciprocal Best-Hits BLAST, and some more Python
  • Mapping with bowtie
  • Visualizing mappings with Samtools
  • Other mappers
  • Short Read Assembly
  • Visualizing NGS data on UCSC genome browser
  • Simple mRNAseq: mapping reads to gene sets and doing quantile normalization
  • Gene differential expression using DEGseq
  • Analyzing bacterial resequencing data with breseq
  • Storing data persistently with Amazon
  • Writing Python scripts incrementally
  • Aligning ChIP-seq reads and detecting enriched peaks
  • Using MEME to identify TF binding motif from ChIP-seq data
  • Slightly more advanced scripting with Python
  • Using the Nano editor
  • Using ‘top’ to monitor running jobs
  • Setting a password rather than using your key


(2) Course Schedule (2010)

    Day 1 (Tue, June 1)

    Lecture: Introduction to the course (Titus Brown)
    Tutorial 1: UNIX, ssh, and scp
    An Exercise in Reflection (Stage 1)

    Day 2 (Wed, June 2)

    Lecture: Computational Basics (Titus Brown)

    Day 3 (Th, June 3)

    Lecture: Thinking Statistically (Ian Dworkin)
    Entertainment: Brewery outing for dinner.
    (No tutorial #3.)

    Day 4 (Fri, June 4)

    Lecture: Mapping Reads to Known Genomes (Titus Brown)
    Tutorial 1: Mapping with bowtie
    Tutorial 3: Other mappers and Bowtie parameters
    Bonfire!

    Day 5 (Sat, June 5)

    Lecture: Assembly. (Titus Brown)
    Tutorial 1: Short Read Assembly (Jason)
    BBQ on Windmill Island.

    Day 6 (Sun, June 6)

    Take a break - day of rest! Swim, sleep, make merry.
    (Only lunch served.)

    Day 7 (Mon, June 7)

    Lecture: mRNAseq analysis. (Titus Brown)
    Tutorial 2 (1:30pm): Gene differential expression using DEGseq (Likit)
    Evening lecture: Doing science on software engineering (Greg Wilson)

    Day 8 (Tue, June 8)

    Lecture: Resequencing analysis. (Jeff Barrick)
    Bonfire!

    Day 9 (Wed, June 9)

    Lecture, 8pm: ChIP-seq analysis. (Mark Robinson)

    Day 10 (Th, June 10)

    Tutorial, 1:30pm: more-advanced-python-scripting


    (3) tutorials of 2011
    http://ged.msu.edu/angus/tutorials-2011/index.html


    Making flowcharts - Linux, Mac, Windows

    http://www.yworks.com/en/index.html

    Saturday, January 28, 2012

    Two papers on analyzing whole-genome resequencing data to detect selection

    This group has published some good examples of analyzing whole-genome resequencing data with the objective of detecting selection.


    2011

    Matteo Fumagalli, Manuela Sironi, Uberto Pozzoli, Anna Ferrer-Admettla, Linda Pattini, Rasmus Nielsen. Signatures of environmental genetic adaptation pinpoint pathogens as the main selective pressure through human evolution. PLoS Genetics, in press. The supplementary material provides the data and R code.
    Cagliani R, Riva S, Pozzoli U, Fumagalli M, Comi GP, Bresolin N, Clerici M, Sironi M. Balancing selection is common in the extended MHC region but most alleles with opposite risk profile for autoimmune diseases are neutrally evolving. BMC Evolutionary Biology 11:171
    Common population genetic tests based on the site frequency spectrum (SFS) include Tajima's D (D_T) [14] and Fu and Li's D* and F* [15]. D_T tests the departure from neutrality by comparing two nucleotide diversity indexes: θ_W [16], an estimate of the expected per-site heterozygosity, and π [17], the average number of pairwise sequence nucleotide differences. Positive values of D_T indicate an excess of intermediate-frequency variants and are a signature of balancing selection. Fu and Li's F* and D* are also based on SNP frequency spectra and differ from D_T in that they also take into account whether mutations occur in external or internal branches of a genealogy [15]. As an empirical comparison, θ_W, π, as well as D_T, F* and D* were calculated for 5 kb windows (thereafter referred to as reference windows) deriving from 238 genes resequenced by the NIEHS program in CEU. Additionally, the statistical significance of neutrality tests was evaluated by performing coalescent simulations with a population genetic model that incorporates demographic scenarios [18].

    Rachele Cagliani, Matteo Fumagalli, Franca R. Guerini, Stefania Riva, Daniela Galimberti, Giacomo P. Comi, Cristina Agliardi, Elio Scarpini, Uberto Pozzoli and Diego Forni, et al. Identification of a new susceptibility variant for multiple sclerosis in OAS1 by population genetics analysis. Human Genetics Jul 7
    Cereda M, Sironi M, Cavalleri M, Pozzoli U. GeCo++: a C++ library for genomic features computation and annotation in the presence of variants. Bioinformatics 2011 Mar 12. [Epub ahead of print]
    Magri F, Del Bo R, D'Angelo MG, Govoni A, Ghezzi S, Gandossini S, Sciacco M, Ciscato P, Bordoni A, Tedeschi S, Fortunato F, Lucchini V, Cereda M, Corti S, Moggio M, Bresolin N, Comi GP. Clinical and molecular characterization of a cohort of patients with novel nucleotide alterations of the Dystrophin gene detected by direct sequencing. BMC Med Genet. 2011 Mar 11;12:37.
    Tollervey JR, Curk T, Rogelj B, Briese M, Cereda M, Kayikci M, Konig J, Hortobágyi T, Nishimura AL, Zupunski V, Patani R, Chandran S, Rot G, Zupan B, Shaw CE, Ule J. Characterizing the RNA targets and position-dependent splicing regulation by TDP-43. Nat Neurosci. 2011 Apr;14(4):452-8.
    Cagliani R, Fruguglietti ME, Berardinelli A, D'Angelo MG, Prelle A, Riva S, Napoli L, Gorni K, Orcesi S, Lamperti C, Pichiecchio A, Signaroldi E, Tupler R, Magri F, Govoni A, Corti S, Bresolin N, Moggio M, Comi GP. New molecular findings in congenital myopathies due to selenoprotein N gene mutations. J Neurol Sci.
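
The excerpt quoted above from the BMC Evolutionary Biology paper describes Tajima's D as a comparison of two diversity estimates, Watterson's θ and π. As a rough illustration only (this is not the authors' R code), the standard Tajima (1989) statistic can be sketched in Python as below. The function name `tajimas_d` and the toy haplotypes are made up for this note. The sketch assumes aligned, equal-length sequences with no gaps or missing data, and it works with π and θ for the whole region rather than per site.

```python
import math
from itertools import combinations


def tajimas_d(haplotypes):
    """Tajima's D for a list of aligned, equal-length sequences (hypothetical helper)."""
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n = 2 and n = 3, so D is undefined there.
        raise ValueError("need at least 4 sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites in the alignment.
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: average number of pairwise differences between sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Watterson's theta: S scaled by the harmonic number a1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1

    # Constants for the variance of (pi - theta_w), following Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)

    return (pi - theta_w) / math.sqrt(variance)


# Toy data: every variant is a singleton, so D comes out negative.
singletons = ["AAAA", "AAAT", "AATA", "ATAA"]
# Toy data: both variants sit at intermediate frequency, so D comes out positive.
balanced = ["AA", "AA", "TT", "TT"]

print(tajimas_d(singletons))
print(tajimas_d(balanced))
```

The positive value for the second toy sample matches the quoted statement that an excess of intermediate-frequency variants pushes D above zero. Judging significance would still need something like the coalescent simulations the authors mention, which this sketch does not attempt.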

    Important references on computational/statistical applications in Evo and Eco

    These references are mostly recommended by my friend, Jinlong.

    1. Phylogeography and Phylogenetics
    http://www.utsc.utoronto.ca/~jweir/

    2. Statistical method in Ecology
    http://www.unc.edu/courses/2010fall/ecol/563/001/index.html
    http://www.unc.edu/courses/2010fall/ecol/563/001/docs/lectures.html

    3. principles of phylogenetics
    http://ib.berkeley.edu/courses/ib200b/IB200B_SyllabusHandouts.shtml

    4. phylogenetic comparative methods
    http://www2.unil.ch/phylo/teaching/pmc.html

    5. Bodega phylogenetic wiki
    http://bodegaphylo.wikispot.org/Topics

    6. wild evolution group
    http://wildevolution.biology.ed.ac.uk/

    7. Quantitative Methods in Ecology and Evolution
    http://www.zoology.ubc.ca/~schluter/bio548/

    8. how to be a quantitative ecologist
    http://greenmaths.st-andrews.ac.uk/index.aspx