MyTetra Share
Shuffle the lines of a file randomly
Created: 02.02.2017 17:38
Tags: linux file shuffle
Section: Linux



I want to shuffle a large file with millions of lines of strings in Linux. I tried 'sort -R', but it is very slow (it takes about 50 minutes for a 16M file). Is there a faster utility that I can use in its place?

linux bash unix


asked Feb 6 '13 at 10:48





Shuf? – Anders Lindahl Feb 6 '13 at 10:51



millions of lines for a 16MB file: you have very short lines? BTW: 16 MB is not big. It will fit in core, and sorting will take less than a second, I guess. – wildplasser Feb 6 '13 at 10:56 



@AndersLindahl: What's the entropy shuf introduces? Is it as random as 'sort -R'? – alpha_cod Feb 6 '13 at 11:05



@wildplasser: Oh... it's a 16-million-line file, not 16 MB. Sorting is quite fast on this file, but 'sort -R' is very slow. – alpha_cod Feb 6 '13 at 11:05



@alpha_cod: I would guess it's /dev/random. You can control the entropy source with --random-source. – Anders Lindahl Feb 6 '13 at 11:33



This is a similar thread… – Ifthikhan Feb 6 '13 at 12:10



@AndersLindahl How about suggesting that as an answer? – that other guy Feb 6 '13 at 20:02


3 Answers



Use shuf instead of sort -R (man page).

The slowness of sort -R is probably due to it hashing every line. shuf just does a random permutation, so it doesn't have that problem.

(This was suggested in a comment but for some reason not written as an answer by anyone)
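
For illustration, here is a minimal sketch of the shuf approach, assuming GNU coreutils is installed. The temporary files and the 1..1000 sample data are made up for the example and are not from the question. The last part uses the --random-source option mentioned in the comments: feeding the same bytes to two runs should give the same permutation, which can be handy when a repeatable shuffle is wanted.

```shell
#!/bin/sh
# Hypothetical temporary files, used only for this illustration.
input=$(mktemp)
output=$(mktemp)
seed=$(mktemp)
run1=$(mktemp)
run2=$(mktemp)

# Build a small sample input: the numbers 1..1000, one per line.
seq 1 1000 > "$input"

# shuf writes a random permutation of the input lines.
shuf "$input" > "$output"

# Same lines, different order: sorting the shuffled output numerically
# should reproduce the original file.
sort -n "$output" | cmp -s - "$input" && echo "same set of lines"

# Save a fixed chunk of random bytes and reuse it as the random source.
# 64 KiB is an assumed size that is ample for 1000 lines; a larger input
# needs more bytes, and shuf stops with an error if the source runs out.
head -c 65536 /dev/urandom > "$seed"
shuf --random-source="$seed" "$input" > "$run1"
shuf --random-source="$seed" "$input" > "$run2"
cmp -s "$run1" "$run2" && echo "repeatable shuffle"
```

For the file in the question, the whole thing comes down to running shuf on the input file and redirecting the output to a new file. Note that shuf reads its entire input before it writes anything.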
