VCRU Bioinformatics, USDA ARS Vegetable Crops Research Unit

This page was last updated on Wednesday, 12-Sep-2012 09:58:41 CDT

454 Instructions - Index

Computer Requirements

Roche software runs in a Linux environment.
RedHat is the supported distribution, and Fedora is reported to work in some cases.

The VCRU has two computers with plenty of memory and processors:
the Cucumber server (32 GB of memory and 8 processors) and the Cranberry server (64 GB of memory and 16 processors).
Talk to Doug to get an account set up on one of these computers.
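
If you want to see how a Linux computer of your own compares, a quick check along these lines may help. This is only a sketch: it assumes a Linux system that provides /proc/meminfo, and the function names (distro_name, cpu_count, mem_total_gb) are made up for this illustration.

```shell
#!/bin/sh
# Report the distribution, processor count, and total memory of this computer.

# RedHat and Fedora both ship /etc/redhat-release; other distributions may not.
distro_name() {
    if [ -r /etc/redhat-release ]; then
        cat /etc/redhat-release
    else
        echo "unknown (no /etc/redhat-release)"
    fi
}

# Number of processors currently online.
cpu_count() {
    getconf _NPROCESSORS_ONLN
}

# Total memory in whole GB (rounded down), from the MemTotal line of
# /proc/meminfo, which is reported in kB.
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

echo "Distribution: $(distro_name)"
echo "Processors:   $(cpu_count)"
echo "Memory (GB):  $(mem_total_gb)"
```

The memory figure is usually a little below the installed amount, because the kernel reserves some memory for itself and the value is rounded down.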

VNC (Remote desktop connection to server)

If you set up a VNC client, you can run the Roche programs (and others) from your own computer.
Instructions to do that are on this page.

Roche 454 Software

Skip this if you will use the software on the Cranberry server

The Roche-recommended platform is RedHat Linux.
To install on Fedora 13, there are a few things to do first; see this page for more information.

Basic install steps are:
  1. Change to a temporary working directory with cd /tmp
  2. Download the installer with wget --user=xxx --password=xxx http://www.vcru.wisc.edu/simonlab/bioinformatics/up/download/DataAnalysis_2.3.tgz
    where --user and --password provide access to the password-protected sections of this web site (see Doug for the password)
  3. Uncompress with tar -zxvf DataAnalysis_2.3.tgz
  4. Install with cd DataAnalysis_2.3 followed by sudo ./INSTALL
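
The install steps can be collected into one small script. This is a sketch, not something Roche provides: the run helper, the install_roche function, and the DRY_RUN, SITE_USER, and SITE_PASSWORD variables are all made up here for illustration. By default it only prints the commands so you can check them first; the xxx values are the same placeholders used above.

```shell
#!/bin/sh
# Sketch of the basic install steps. Dry run by default: each command is
# printed, not executed. Set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}

# Placeholders, as on this page; see Doug for the real values.
SITE_USER=${SITE_USER:-xxx}
SITE_PASSWORD=${SITE_PASSWORD:-xxx}

VERSION=2.3
URL=http://www.vcru.wisc.edu/simonlab/bioinformatics/up/download/DataAnalysis_${VERSION}.tgz

# run: hypothetical helper that prints a command, or executes it when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded.
install_roche() {
    run cd /tmp &&
    run wget --user="$SITE_USER" --password="$SITE_PASSWORD" "$URL" &&
    run tar -zxvf "DataAnalysis_${VERSION}.tgz" &&
    run cd "DataAnalysis_${VERSION}" &&
    run sudo ./INSTALL
}

install_roche
```

Chaining the steps with && means a failed download stops the script before it tries to uncompress or install anything.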

Madison WI Users' Contact Information for Roche

Roche Technical Support Contact Information

Our local Roche representative:

Dan Brekken, M.S.
Key Account Manager-Sequencing
Cell 1-303-941-3155
dan.brekken@roche.com

Illustration of gsAssembler 2.6 bug

Getting your 454 sequences from Biotech

  1. You will get 454 sequences the same way you have been getting Sanger sequences, from the UW Biotech Center web server at
    https://facilities.biotech.wisc.edu/download
    Just log in using your own UW NetID and password.
    Note that instead of being in the DNA Sequencing folder, they will be in a folder named Advanced Genome Analysis Resource.
  2. You need to get this file to a Linux computer with the Roche software. Here are some ways to do that:
    1. Download using Firefox while working on the Linux computer (easiest)
    2. Copy using the "share" network accessible folder on the Cranberry server
    3. Copy to AFS and then access AFS from the Linux server
  3. Where to put them?
    Don't put them in your home directory; there is not enough space there, and it is not backed up automatically. I have lots of space available on the data drives, and these are backed up nightly to two different computers.
    The Cranberry server has 1.5 TB on /vmdata1 and 2 TB on /vmdata2; make a folder on one of these.
    I have already created /vmdata1/454,
    and within that directory I am dividing sequence data by species, into cranberry, carrot, onion, and cucumber.
    (If you are wondering, the vm in vmdata stands for Vaccinium macrocarpon, i.e. cranberry)
  4. Uncompress the file
    1. Uncompress a .gz archive with gunzip < yourfilename.gz > yourfilename
      (this keeps the original .gz file)
    2. or uncompress a .tgz or .tar.gz archive with tar -zxvf yourfilename.tar.gz
      or, maybe easiest, uncompress in File Browser by right clicking on the file and selecting Extract Here
  5. Safety: I recommend you make the original sequence files read-only so that it is a little bit harder to accidentally delete them. Do that with this command:
    chmod -w *.sff
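
Steps 3 to 5 can be sketched as a few shell functions. The function names (species_dir, unpack_454, protect_sff) and the DATA_ROOT variable are made up for this example; the /vmdata1/454 location and the commands themselves are the ones given in the steps. Treat it as a starting point and adjust paths to your own setup.

```shell
#!/bin/sh
# Sketch of steps 3 to 5: pick a species folder, uncompress the download,
# and make the .sff files read-only.

# Data root from this page; override DATA_ROOT to try this somewhere else.
DATA_ROOT=${DATA_ROOT:-/vmdata1/454}

# species_dir: create (if needed) and print the folder for one species,
# for example carrot -> /vmdata1/454/carrot
species_dir() {
    mkdir -p "$DATA_ROOT/$1" && echo "$DATA_ROOT/$1"
}

# unpack_454: uncompress one downloaded file into the folder it sits in.
unpack_454() {
    archive=$1
    case "$archive" in
        *.tar.gz|*.tgz)
            tar -zxvf "$archive" -C "$(dirname "$archive")" ;;
        *.gz)
            # Redirecting input and output keeps the original .gz file.
            gunzip < "$archive" > "${archive%.gz}" ;;
        *)
            echo "unrecognized archive type: $archive" >&2
            return 1 ;;
    esac
}

# protect_sff: remove write permission from every .sff file in a folder.
protect_sff() {
    for f in "$1"/*.sff; do
        [ -e "$f" ] || continue   # no .sff files: the pattern stays unexpanded
        chmod -w "$f"
    done
}

# Example use (yourfilename.tar.gz is a placeholder, as in step 4):
#   dir=$(species_dir carrot)
#   mv yourfilename.tar.gz "$dir"/
#   unpack_454 "$dir/yourfilename.tar.gz"
#   protect_sff "$dir"
```

The .tar.gz pattern is tested before the plain .gz pattern so that tar archives are not mistaken for single compressed files.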

Assembly

I have added the Roche software to the "Bioinformatics" menu for all users.
For example, to run gsAssembler:
open the "Bioinformatics" menu, then the gsAssembler submenu

The Roche documentation is in this file (password protected, see Doug for password):
Manual is on this page

Secrets!

Here are a few things I have learned that are not immediately obvious:

Mapping

Run gsMapper from the "Bioinformatics" menu

The Roche documentation is in this file (password protected, see Doug for password):
Manual is on this page

After Assembly

Look at this page for some other things you can do after assembly