Wednesday, October 19, 2016

[solved] linux join : “File 2 not in sorted order”

Solution:
Sort both files with LANG=en_EN sort -k 1,1 <myfile> ..., then run LANG=en_EN join ... so that sort and join use the same locale.
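The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses LC_ALL=C (plain byte order) instead of the LANG=en_EN from the post, because LC_ALL takes precedence over LANG if both are set; the file names and contents are made up; and the files are assumed to be tab-separated.

```shell
# Hypothetical sample data in a scratch directory (names and contents are invented).
workdir=$(mktemp -d)
tab=$(printf '\t')
printf 'GENE-B\t1\nGENE1\t2\nGENEA\t3\n' > "$workdir/a.tsv"
printf 'GENEA\tx\nGENE-B\ty\nGENE1\tz\n' > "$workdir/b.tsv"

# Sort both files on the first field only, under the same locale.
LC_ALL=C sort -t "$tab" -k 1,1 "$workdir/a.tsv" > "$workdir/a.sorted.tsv"
LC_ALL=C sort -t "$tab" -k 1,1 "$workdir/b.tsv" > "$workdir/b.sorted.tsv"

# Join under that same locale, so join's order check agrees with sort.
LC_ALL=C join -t "$tab" "$workdir/a.sorted.tsv" "$workdir/b.sorted.tsv" > "$workdir/joined.tsv"
cat "$workdir/joined.tsv"
```

The point is less the particular locale than that sort and join both see the same one.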

Note:
Don't use -k 1d,1 to sort strings such as gene names.
Always join on ENSEMBL IDs.

Cause:
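sort -k 1d,1 and join probably use different orderings, especially for strings containing special characters such as '-'. The d modifier asks sort for dictionary order, which ignores everything except letters, digits and blanks, so a name like GENE-B is compared as if it were GENEB.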
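A small experiment makes the mismatch visible. It is a sketch with invented gene names, run under LC_ALL=C (an assumption, chosen so the result does not depend on the installed locales): the same file is sorted once in dictionary order and once in plain byte order, and sort -c then checks whether the dictionary-sorted file would pass as byte-sorted.

```shell
# Hypothetical one-column file of gene names.
demo_dir=$(mktemp -d)
printf 'GENE-B\nGENE1\nGENEA\n' > "$demo_dir/genes.txt"

# Dictionary order: '-' is ignored, so GENE-B is compared as GENEB and lands last.
LC_ALL=C sort -k 1d,1 "$demo_dir/genes.txt" > "$demo_dir/dict.txt"

# Plain byte order: '-' sorts before digits and letters, so GENE-B comes first.
LC_ALL=C sort -k 1,1 "$demo_dir/genes.txt" > "$demo_dir/bytes.txt"

# sort -c only checks the order; it rejects the dictionary-sorted file.
if LC_ALL=C sort -c -k 1,1 "$demo_dir/dict.txt" 2>/dev/null; then
    verdict="dict.txt is in byte order"
else
    verdict="dict.txt is NOT in byte order"
fi
echo "$verdict"
```

A file sorted the first way can therefore look unsorted to a tool that expects the second ordering, which would be consistent with the error in the title.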
