On my Linux desktop I have a UTF-8 locale. When I try to search some KOI8-R encoded files with grep (ack), it fails. If I manually encode the pattern into KOI8-R and pass that as an argument, it works.
Is it possible to tell grep what encoding to use for the pattern? Or any other tool?
If all the files you're searching in have the same encoding:
LC_CTYPE=ru_RU.KOI8-R luit ack-grep "$(echo 'привет' | iconv -t KOI8-R)" *.txt
or, in bash or zsh:
LC_CTYPE=ru_RU.KOI8-R luit ack-grep "$(iconv -t KOI8-R <<<'привет')" *.txt
Or start a child shell in the desired encoding:
$ LC_CTYPE=ru_RU.KOI8-R luit
$ ack-grep 'привет' *.txt
$ exit
Luit (shipped with XFree86 and X.org) runs the program specified on its command line in the locale specified by the LC_CTYPE setting, assuming a UTF-8 terminal. So the command runs in the desired locale, and Luit translates its terminal output to UTF-8.
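If Luit isn't available, or the output goes to a pipe rather than a terminal, roughly the same effect can be had with iconv on both sides of grep. This is only a sketch: the function name grep_koi8r is made up for illustration, it assumes the pattern arrives in UTF-8 and every file is KOI8-R, and it runs grep with LC_ALL=C so that matching is byte-wise (case-insensitive matching and character classes won't understand Cyrillic this way).

```shell
#!/bin/sh
# grep_koi8r PATTERN FILE...   (hypothetical helper name)
# Assumes PATTERN is UTF-8 and every FILE is KOI8-R encoded.
grep_koi8r() {
    # Re-encode the pattern from UTF-8 into the files' encoding.
    pattern=$(printf '%s' "$1" | iconv -f UTF-8 -t KOI8-R) || return
    shift
    # LC_ALL=C makes grep compare raw bytes; the matching lines are then
    # converted back to UTF-8 for display. The exit status is iconv's,
    # not grep's.
    LC_ALL=C grep -- "$pattern" "$@" | iconv -f KOI8-R -t UTF-8
}
```

Used as grep_koi8r 'привет' *.txt. Note that file names printed by grep pass through the same iconv step, so non-ASCII file names would come out garbled.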
Another approach, if you have a directory tree with a lot of files in a different encoding, is to mount a view of that directory tree in your preferred encoding. I think the fuseflt filesystem can do this (untested).
mkdir /utf8-view
fuseflt iconv-koi8r-utf8.conf /some/dir /utf8-view
ack-grep 'привет' /utf8-view/*.txt.utf8
fusermount -u /utf8-view
where the configuration file iconv-koi8r-utf8.conf contains:
ext_in =
ext_out = *.utf8
flt_in =
flt_out = .utf8
flt_cmd = iconv -f KOI8-R -t UTF-8
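If setting up a FUSE mount is more than you need, the same idea (search a UTF-8 rendition of each file) can be sketched as a plain shell loop. The function name search_as_utf8 is hypothetical, and the sketch assumes every file passed to it is KOI8-R; the pattern is given in UTF-8 as usual.

```shell
#!/bin/sh
# search_as_utf8 PATTERN FILE...   (hypothetical helper name)
# Assumes every FILE is KOI8-R; PATTERN is given in UTF-8.
search_as_utf8() {
    pattern=$1
    shift
    for f in "$@"; do
        # Convert the file to UTF-8 on the fly and search the result.
        iconv -f KOI8-R -t UTF-8 < "$f" |
            grep -- "$pattern" |
            while IFS= read -r line; do
                # grep only sees a pipe, so prefix the file name ourselves.
                printf '%s:%s\n' "$f" "$line"
            done
    done
}
```

Used as search_as_utf8 'привет' /some/dir/*.txt. Because grep sees UTF-8 here, it can run in your normal locale, at the cost of one iconv process per file.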