Wednesday, September 26, 2012

How to quickly download Illumina iGenomes

To use TopHat/Cufflinks/Cuffdiff/cummeRbund, you might need to download iGenomes from Illumina.
They can be found on the TopHat website: http://tophat.cbcb.umd.edu/igenomes.html

But if you try to download the human or mouse iGenomes (approximately 20 GB and 15 GB, respectively) through a browser, the download will start and then most probably stall after some time.

Instead, connect to the Illumina FTP server directly.

Either use the terminal (type what follows each prompt; in this example I downloaded the mouse genome mm9):

C02HQ105DHJQ:~ mpjanic$ ftp
ftp> open ussd-ftp.illumina.com
Connected to ussd-ftp.illumina.com.
220 EFT Server 6.4.1 Build 12.19.2011.1
Name (ussd-ftp.illumina.com:mpjanic): igenome
331 Password required for igenome.
Password: G3nom3s4u
230 Login OK. Proceed.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> pwd
Remote directory: /
ftp> cd Mus_musculus
250 Folder changed to "/Mus_musculus".
ftp> ls
229 Entering Extended Passive Mode (|||56930|).
150 Opening ASCII mode data connection for file list.
dr-xr--r--   1 user     group           0 May 11 14:00 Ensembl
dr-xr--r--   1 user     group           0 May 11 14:34 NCBI
dr-xr--r--   1 user     group           0 May 24 11:09 UCSC
226 Transfer complete. 186 bytes transferred. 186 bps.
ftp> cd UCSC
250 Folder changed to "/Mus_musculus/UCSC".
ftp> ls
229 Entering Extended Passive Mode (|||56960|).
150 Opening ASCII mode data connection for file list.
dr-xr--r--   1 user     group           0 May 16 16:44 mm9
dr-xr--r--   1 user     group           0 Jun 14 17:03 mm10
226 Transfer complete. 121 bytes transferred. 121 bps.
ftp> cd mm9
250 Folder changed to "/Mus_musculus/UCSC/mm9".
ftp> ls
229 Entering Extended Passive Mode (|||56962|).
150 Opening ASCII mode data connection for file list.
-r--r--r--   1 user     group 15244063347 May 14 21:12 Mus_musculus_UCSC_mm9.tar.gz
226 Transfer complete. 85 bytes transferred. 85 bps.
ftp> get Mus_musculus_UCSC_mm9.tar.gz
local: Mus_musculus_UCSC_mm9.tar.gz remote: Mus_musculus_UCSC_mm9.tar.gz
229 Entering Extended Passive Mode (|||56986|).
150 Opening BINARY mode data connection for Mus_musculus_UCSC_mm9.tar.gz.
  1% |*                                                                                                                  |   180 MiB  590.63 KiB/s  6:54:51 ETA^Z
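
If you would rather script the download than type the session by hand, the same steps can be sketched in Python with the standard-library ftplib module. This is only a sketch: the host, login, directory and file name are copied from the session above, while the function names fetch and main are my own, and I have not measured whether it is any faster than the ftp client.

```python
from ftplib import FTP

# Values copied from the terminal session above.
HOST = "ussd-ftp.illumina.com"
USER = "igenome"
PASSWORD = "G3nom3s4u"
REMOTE_DIR = "/Mus_musculus/UCSC/mm9"
FILENAME = "Mus_musculus_UCSC_mm9.tar.gz"


def fetch(ftp, remote_dir, filename, local_path):
    """Download one file in binary mode over a logged-in connection.

    Returns the number of bytes written to local_path.
    """
    written = 0
    with open(local_path, "wb") as out:

        def save(chunk):
            # Called by ftplib for every block received from the server.
            nonlocal written
            out.write(chunk)
            written += len(chunk)

        ftp.cwd(remote_dir)  # same as `cd` in the session
        # RETR in binary mode is the scripted equivalent of `get`.
        ftp.retrbinary("RETR " + filename, save, blocksize=1 << 20)
    return written


def main():
    """Log in with the credentials above and download the mm9 archive."""
    with FTP(HOST) as ftp:  # ftplib uses passive mode by default
        ftp.login(USER, PASSWORD)
        size = fetch(ftp, REMOTE_DIR, FILENAME, FILENAME)
    print(f"Downloaded {size} bytes to {FILENAME}")
```

Call main() to start the download into the current directory. For another genome, change REMOTE_DIR and FILENAME to match what ls shows on the server.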

In the ftp session above, the speed was around 500 KB/s.
However, a much faster way is to use FileZilla.

Install FileZilla and connect with these settings:
Host: ussd-ftp.illumina.com
Username: igenome
Password: G3nom3s4u

In the left pane, navigate to the local folder where you want to save the download.
In the right pane, navigate to the remote folder that holds the mm9 genome (or the genome you are interested in).

Right-click the file and choose Download.

The speed was in the range of 2-3 MB/s!
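
To put the two speeds side by side, here is a rough calculation of how long the whole mm9 archive takes at each rate. The file size and the 590.63 KiB/s figure come from the ftp session above. I am assuming the FileZilla figure means megabytes per second (1 MB = 10^6 bytes) and that the rate stays constant, so treat the results as estimates only.

```python
# Size reported by `ls` for Mus_musculus_UCSC_mm9.tar.gz in the ftp session.
SIZE_BYTES = 15_244_063_347


def transfer_seconds(size_bytes, bytes_per_second):
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / bytes_per_second


def as_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Rates in bytes per second. The MB/s values are an assumption (see above).
rates = {
    "command-line ftp (590.63 KiB/s)": 590.63 * 1024,
    "FileZilla, low end (2 MB/s)": 2 * 1000**2,
    "FileZilla, high end (3 MB/s)": 3 * 1000**2,
}

for label, rate in rates.items():
    print(f"{label}: {as_hms(transfer_seconds(SIZE_BYTES, rate))}")
```

Under these assumptions the full download takes roughly seven hours in the terminal, against about one and a half to two hours in FileZilla.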

2 comments:

  1. Great stuff. One point: avoid FileZilla. Nothing wrong with it, but it's hosted on SourceForge, and we probably all know the problems there (bundled malware etc.).

  2. The Illumina FTP server does not support resuming broken downloads. That's bad news for my poor network.
