Tuesday, October 25, 2011

How to separate a column into multiple columns in Excel and LibreOffice Calc

One of the very useful features of Excel and LibreOffice Calc is the 'Text to Columns' feature.

In case you have a column that you want to separate, e.g. the first column below:
chr21:33032286-33032287    rs17881180
chr21:33032856-33032857    rs114905802
chr21:33032987-33032988    rs6650814
chr21:33033000-33033001    rs4816405

and you want to split it into three new columns:
chr21    33032286    33032287    rs17881180
chr21    33032856    33032857    rs114905802
chr21    33032987    33032988    rs6650814
chr21    33033000    33033001    rs4816405

Select the first column
Go to Data/Text to Columns
Under Separated by, type the separators (here : and -) in the Other field, with a space between them.

Before you do this, make sure you have inserted two empty columns after the first one, since the split would otherwise overwrite the existing second column.
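
For files too large to handle comfortably in a spreadsheet, the same split can be sketched in Python. This is only an illustration: it assumes the columns are separated by whitespace, and the function name split_first_column is made up for this example.

```python
import re

def split_first_column(line, separators=":-"):
    """Split the first whitespace-separated field on any of the separator characters."""
    first, *rest = line.split()
    # Build a character class such as [:\-] from the separator characters
    parts = re.split("[" + re.escape(separators) + "]", first)
    return parts + rest

rows = [
    "chr21:33032286-33032287    rs17881180",
    "chr21:33032856-33032857    rs114905802",
]

for row in rows:
    # Print the new columns tab-separated, ready to paste into a spreadsheet
    print("\t".join(split_first_column(row)))
```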

Monday, October 24, 2011

How to combine cells in Excel and LibreOffice Calc

In case you need to combine cells in Excel or LibreOffice Calc, you can join them with a formula.

E.g. in case you have the following three cells:
chr14    71277391    71281827
and need to make one e.g.

use the following formula:


Thus, separate the cell references with the & sign (no & at the beginning or the end of the formula),
and put any additional characters that you want to place between the cell references inside " "

In this example, we took a line from a BED file and converted the chromosome name and the genomic start and end locations into one string that can be used as an ID for that genomic position.
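
The same joining can be sketched in Python. The target format chr:start-end is an assumption here (it mirrors the chr21:33032286-33032287 style of IDs), and make_region_id is a hypothetical helper name. Under the same assumption, a spreadsheet formula of the kind described above would look like =A1&":"&B1&"-"&C1.

```python
def make_region_id(chrom, start, end):
    """Join chromosome, start and end into one ID string (assumed format chr:start-end)."""
    return f"{chrom}:{start}-{end}"

# First three fields of a BED line: chromosome, start, end
bed_line = "chr14    71277391    71281827"
chrom, start, end = bed_line.split()[:3]

print(make_region_id(chrom, start, end))
```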

Wednesday, October 19, 2011

Is Omicsonline a scientific scam?

Yes it is. 

If you have published scientific papers you will most probably get an email like the one below. Emails like this come from the Omicsonline group, from either @omicsonline.org or @omicsgroup.com addresses. They will ask you either to write a paper or to attend a conference. In either case you will have to pay a fee.

Such conferences most probably don't exist, and the papers you can find on their website are not indexed in PubMed. I suspect they are made-up papers with made-up authors or, in the worst case, that the authors don't even know they have published papers in these journals. It is possible that the names of scientists have been used without their knowledge and that they don't even know that they are members of the editorial board, for example.

Create a filter for such emails and send them to the spam folder.

Dear Dr. Milos Pjanic

We hereby want to invite you to submit a paper for The Journal of Health & Medical Informatics for a special issue.

Special issue title: Bioinformatics

This special issue is being edited by:
Yixuan Wang
Albany State University

It would be grateful if you would submit a paper for this upcoming special issue on " Bioinformatics". Both Research and Review papers are welcome for possible publication in this issue. The deadline for submissions is 22nd November, 2011 and the target publishing date is 23rd, December 2011.
You may submit your paper by e-mail at editor.jhmi@omicsonline.org
or online at http://www.omicsonline.org/submission/
Please specify the title of the special issue in the subject if you submit by e-mail, or in the cover letter if you submit online.

Special features to articles published in The Journal of Health & Medical Informatics include:
1)Timely dissemination of your research work
2)Free PDF/Digital file of your published paper
3)No restriction for use/distribution
4)Create great looking digital files for distribution
5)Sharing you published work in social networking like Facebook, twitter, LinkedIn, RSS feeds, etc.
6)Translation of published paper to more than 50 languages
7)Nominal processing fee and reduction (i.e. 1/2) of general processing fee.

Benefits of the special issue
Contributors/authors: On special consideration, processing fee for special issue articles: $ 919

For more details on regular articles processing fee PS:http://www.omicsonline.org/Openaccesspublicationfee.pdf
Thank you for your time and consideration in this matter. We would appreciate it if you could let us know at your earliest convenience (but no longer than three weeks please) whether or not you will be submitting a paper. We are looking forward to hear from you soon!

With kind regards,
Yixuan Wang
Editor of Special Issue

Editorial office
OMICS Publishing Group
5716 Corsa Ave., Suite 110
Westlake, Los Angeles
CA 91362-7354, USA
E-mail: editor.jhmi@omicsonline.org
Phone: +1-650-268-9744
Fax: +1-650-618-1414
Toll free : +1-800-216-6499

How to search for enriched GO terms in the gene list

If you have a gene list that came out of your experiment and that is ranked, for example by p-value, there is an easy way to check for the enrichment of specific GO (gene ontology) terms.
Go to http://cbl-gorilla.cs.technion.ac.il/
Paste your gene list separated by newlines (\n), i.e. one gene per line
Choose the species

Choose running mode: Single ranked list of genes
If your experiment produced a ranked list of genes, paste in the complete list in ranked order (e.g. if you had 20,000 genes tested and ranked in your experiment, paste in all 20,000 genes in the order of ranking).

Choose an ontology: All
Select all 3 groups of ontologies to be shown as output.
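
If the ranking lives in a table of p-values, the one-gene-per-line input can be produced with a few lines of Python. This is only a sketch: the gene names and p-values below are hypothetical placeholders, and ranked_gene_list is a made-up helper name.

```python
def ranked_gene_list(pvalues):
    """Return gene names ordered by ascending p-value, one gene per line."""
    ranked = sorted(pvalues, key=pvalues.get)
    return "\n".join(ranked)

# Hypothetical results: gene name -> p-value
results = {"GENE_A": 0.03, "GENE_B": 0.0004, "GENE_C": 0.2}

# The printed text can be pasted into the gene list box
print(ranked_gene_list(results))
```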

The output is a table:

GO term    Description    P-value    Enrichment (N, B, n, b)
GO:0030695    GTPase regulator activity    5.45E-6    1.72 (13439,402,1552,80)
GO:0060589    nucleoside-triphosphatase regulator activity    7.46E-6    1.70 (13439,412,1552,81)
GO:0005488    binding    3.36E-5    1.07 (13439,9544,1560,1185)
GO:0042578    phosphoric ester hydrolase activity    2.42E-4    1.83 (13439,311,1110,47)
GO:0005509    calcium ion binding    3.77E-4    1.49 (13439,515,1591,91)
GO:0005083    small GTPase regulator activity    4.58E-4    1.71 (13439,258,1552,51)
GO:0005096    GTPase activator activity    4.66E-4    1.75 (13439,233,1552,47)
GO:0005085    guanyl-nucleotide exchange factor activity    6.01E-4    1.91 (13439,161,1484,34)
GO:0047555    3',5'-cyclic-GMP phosphodiesterase activity    6.25E-4    11.71 (13439,7,656,4)
GO:0005515    protein binding    8.63E-4    1.12 (13439,5545,1458,672)
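
The Enrichment column can be recomputed from the four numbers in parentheses. As far as I understand the GOrilla output, N is the total number of genes, B is the number of genes associated with the GO term, n is the number of genes in the top of the ranked list (or in the target set), and b is the number of those n genes that carry the GO term; enrichment is then (b/n) / (B/N). A minimal sketch that checks this against two rows of the table above:

```python
def enrichment(N, B, n, b):
    """Fold enrichment: fraction of term genes in the top set over the fraction overall."""
    return (b / n) / (B / N)

# GO:0030695, GTPase regulator activity
print(round(enrichment(13439, 402, 1552, 80), 2))  # 1.72

# GO:0047555, 3',5'-cyclic-GMP phosphodiesterase activity
print(round(enrichment(13439, 7, 656, 4), 2))  # 11.71
```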

In case you don't have a ranked list of genes, but just a group of genes that came out of the experiment, select as running mode: Two unranked lists of genes (target and background lists).
In this case you will need to provide a background list of genes for the analysis. If, for example, your gene list comes from a microarray experiment, the background should be all the genes on the array.

Thursday, October 13, 2011

How to remove duplicates in LibreOffice Calc

If you have a column or a table and you want to remove duplicate entries, do the following (in LibreOffice Calc):

Select the range you want to filter
Data/Filter/Standard Filter
Under Field name, choose the column that you want to filter
Select a condition that is always true, like field1 = Not empty
Click on the More Options button, tick the option that removes duplicates (No duplications), tick Copy results to and enter the location of an empty cell
The whole selected range will be copied without duplicates to that new location.
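
Outside the spreadsheet, the same result (a copy without duplicates that keeps the first occurrence of each entry in its original order) can be sketched in Python. The function name remove_duplicates and the sample values are made up for this example.

```python
def remove_duplicates(values):
    """Return the values without duplicates, keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order
    return list(dict.fromkeys(values))

# Works for a single column ...
column = ["rs1", "rs2", "rs1", "rs3", "rs2"]
print(remove_duplicates(column))

# ... and for table rows, as long as each row is a tuple
rows = [("chr1", 100), ("chr2", 200), ("chr1", 100)]
print(remove_duplicates(rows))
```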

Tuesday, October 11, 2011

How to sort columns by rows in Excel and LibreOffice Calc

Excel and OpenOffice/LibreOffice Calc have a Sort function that by default sorts rows by the column you select.

In case one wants e.g. to sort the columns of the table below according to row 1:

NHD    T2D    NHD    T2D    NHD    T2D    NHD    T2D
7    3    4    7    0    14    8    5
17    12    9    14    13    10    18    13
3    0    3    3    0    2    2    0
1    0    0    0    0    0    0    1
2    0    0    0    1    0    0    0
1    1    1    3    4    2    0    0

In LibreOffice Calc, select the table and go to Data/Sort:

Options/Direction/Left to right (sort columns)
Sort criteria/Sort by/select the row you want

The output is the sorted table:

NHD    NHD    NHD    NHD    T2D    T2D    T2D    T2D
7    4    0    8    3    7    14    5
17    9    13    18    12    14    10    13
3    3    0    2    0    3    2    0
1    0    0    0    0    0    0    1
2    0    1    0    0    0    0    0
1    1    4    0    1    3    2    0
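
The same left-to-right sort can be sketched in Python by transposing the table, sorting the columns by the chosen row, and transposing back. The function name sort_columns_by_row is made up for this example. Python's sort is stable, so columns with the same label keep their original relative order, which matches the output shown above.

```python
def sort_columns_by_row(table, row_index=0):
    """Reorder the columns of a table (list of rows) by the values in one row."""
    columns = list(zip(*table))                    # transpose: rows -> columns
    columns.sort(key=lambda col: col[row_index])   # stable sort on the chosen row
    return [list(row) for row in zip(*columns)]    # transpose back

table = [
    ["NHD", "T2D", "NHD", "T2D", "NHD", "T2D", "NHD", "T2D"],
    [7, 3, 4, 7, 0, 14, 8, 5],
    [17, 12, 9, 14, 13, 10, 18, 13],
]

for row in sort_columns_by_row(table):
    print("\t".join(str(cell) for cell in row))
```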

Monday, October 10, 2011

How to scan a 3'UTR of gene of interest for conserved miRNA homologies

Another great tool for miRNA analysis is TargetScan.


TargetScan searches the 3'UTR for conserved 8-mer and 7-mer sites that match the miRNA seed region. Optionally, nonconserved sites are also predicted.

The additional strength of such analysis is that by searching for the conservation among species we increase the chance of finding functionally relevant miRNA matches in the 3'UTR of a gene.
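
To make the idea of a seed match concrete, here is a minimal Python sketch that scans a sequence for 8mer, 7mer-m8 and 7mer-A1 sites of one miRNA. It is not TargetScan's implementation and it ignores conservation entirely. The site definitions used (a match to miRNA positions 2-7, extended by a match to position 8 and/or an A opposite position 1) are my reading of the usual site types, the UTR string is a toy sequence invented for the example, and the miRNA is let-7a.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]

def find_seed_sites(utr, mirna):
    """Return (start, site_type) tuples, 0-based, for seed matches in the UTR."""
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])   # pairs with miRNA positions 2-7
    m8_base = reverse_complement(mirna[7])  # pairs with miRNA position 8
    sites = []
    for i in range(len(utr) - len(core) + 1):
        if utr[i:i + 6] != core:
            continue
        has_m8 = i > 0 and utr[i - 1] == m8_base
        has_a1 = i + 6 < len(utr) and utr[i + 6] == "A"
        if has_m8 and has_a1:
            sites.append((i - 1, "8mer"))
        elif has_m8:
            sites.append((i - 1, "7mer-m8"))
        elif has_a1:
            sites.append((i, "7mer-A1"))
    return sites

let7a = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "GGCUACCUCAUUGUACCUCAGGCUACCUCGG"  # invented sequence, not a real UTR
print(find_seed_sites(toy_utr, let7a))
```

Real predictions should still come from TargetScan itself, since conservation across species is what this sketch leaves out.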

This is what the TargetScanHuman output looks like when the human NFIC gene is scanned for conserved miRNA sites.

The resulting output shows conserved (and poorly conserved) miRNAs that potentially regulate the 3'UTR of the gene of interest. Below the sequence comparison of different species there is a list of conserved and poorly conserved miRNA matches. For the NFIC gene there are 3 conserved miRNAs, which can be accessed by clicking on their respective names under the gene depiction.

To view poorly conserved miRNAs in addition to the conserved ones, click on Show poorly conserved sites and sites for poorly conserved miRNA families. It is also possible to show only poorly conserved miRNAs by hiding the conserved miRNAs.

To download all results as a table click View table of miRNA sites.