MyTetra Share
Share your knowledge!
Shuffle the lines of a file randomly
Created: 02.02.2017 17:38
Tags: linux file shuffle
Section: Linux
Record: Velonski/mytetra-database/master/base/1486039127ejw896urcj/text.html on



I want to shuffle a large file with millions of lines of strings in Linux. I tried 'sort -R', but it is very slow (it takes about 50 minutes for a 16M file). Is there a faster utility that I can use in its place?

linux  bash  unix


asked Feb 6 '13 at 10:48





Shuf?  – Anders Lindahl  Feb 6 '13 at 10:51



millions of lines for a 16MB file: you have very short lines? BTW: 16 MB is not big. It will fit in core, and sorting will take less than a second, I guess. – wildplasser  Feb 6 '13 at 10:56  



@AndersLindahl: What's the entropy shuf introduces? Is it as random as 'sort -R'? – alpha_cod  Feb 6 '13 at 11:05



@wildplasser: Oh... it's a 16-million-line file, not 16 MB. Sorting is quite fast on this file, but 'sort -R' is very slow. – alpha_cod  Feb 6 '13 at 11:05



@alpha_cod: I would guess it's /dev/random. You can control the entropy source with --random-source. – Anders Lindahl  Feb 6 '13 at 11:33



This is a similar thread…  – Ifthikhan  Feb 6 '13 at 12:10



@AndersLindahl How about suggesting that as an answer? – that other guy  Feb 6 '13 at 20:02


3 Answers


Use shuf instead of sort -R (man page).

The slowness of sort -R is probably due to it hashing every line and then sorting by those hashes. shuf just does a random permutation, so it doesn't have that problem.

(This was suggested in a comment but, for some reason, not written as an answer by anyone.)
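As a rough illustration of the answer above, here is a minimal sketch that assumes GNU coreutils (shuf, sort, cmp) is installed. The file names and the 1000-line sample are made up for the example and do not come from the thread. It shuffles a file, checks that only the order changed, and then uses the --random-source option mentioned in the comments to get a repeatable shuffle.

```shell
#!/bin/sh
set -eu

# Hypothetical file names, used only for this illustration.
workdir=$(mktemp -d)
input="$workdir/input.txt"
shuffled="$workdir/shuffled.txt"

# Build a small sample input: one number per line.
seq 1 1000 > "$input"

# Shuffle all lines into a new file.
shuf "$input" > "$shuffled"
# Equivalent form: shuf -o "$shuffled" "$input"

# Sanity check: after sorting, both files should contain the same lines.
sort "$input" > "$workdir/sorted_in.txt"
sort "$shuffled" > "$workdir/sorted_out.txt"
cmp -s "$workdir/sorted_in.txt" "$workdir/sorted_out.txt" && echo "same lines, possibly new order"

# Repeatable shuffle: feed the same random bytes on both runs.
# The seed file is an assumption of this sketch, not part of the thread.
seed="$workdir/seed.bin"
head -c 65536 /dev/urandom > "$seed"
shuf --random-source="$seed" "$input" > "$workdir/run1.txt"
shuf --random-source="$seed" "$input" > "$workdir/run2.txt"
cmp -s "$workdir/run1.txt" "$workdir/run2.txt" && echo "repeatable with a fixed random source"
```

For a real file, only the shuf line is needed. The seed file has to hold enough bytes for the number of lines being shuffled, so a much larger input may need a larger seed than the 64 KiB used here.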
