spring.net — live bbs — text/plain

global search and replace methods

topic 20 · 23 responses
~sprin5 Thu, Apr 27, 2000 (21:38) seed
There are many ways to do a global search and replace. This topic will hopefully present some that work, using grep, sed, and perl scripts.
~sprin5 Thu, Apr 27, 2000 (21:40) #1
I needed to do some global searching and replacing tonight, so I created the following one-line perl script:
    perl -pi~ -e "s/Sorry, we are still under construction!/Bob and Paul are working on this site/g;" `find . -name "*.htm"`
I named it replace.pl, ran chmod +x on it, and voila. It worked like a champ. That's one way to do it that worked for me.
~MarciaH Thu, Apr 27, 2000 (22:04) #2
Fantastic. Next time you'd better post a translation with your comments. Wish I knew more about it, but there are just so many hours and all that...
~sprin5 Fri, Apr 28, 2000 (08:57) #3
I forgot to say I put it in the virtual_html directory.
~terry Mon, Jul 16, 2001 (10:36) #4
    find . -type f | xargs grep -i hotjava
will find every occurrence of the word hotjava (ignoring case) in files under the current directory. Start from / instead of . to search the whole system.
~terry Thu, Aug 9, 2001 (09:47) #5
Search/Replace in many files
An example of how to run a search and replace through many files in UNIX. Comes in handy for situations like when Netscape Composer changes all the links to absolute rather than relative.
From the csh/tcsh command prompt, type:
    foreach file (*.html)
where *.html is the search pattern. There will be a new prompt. Type:
    cp $file $file.orig
to back up the file,
    mv $file xx
which moves the old file into a 'temp' file, and
    sed '1,$s/search/replace/g' xx > $file
where search and replace are your strings. Note: special characters such as / should be preceded by a \. Then type:
    end
Once you type 'end', it will execute these commands for each file.
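On the note about escaping /: sed accepts other characters as the delimiter of the s command, which saves the backslashes when the strings are URLs or paths. A small sketch for the absolute-links case; the helper name relativize_links and the www.example.com prefix are assumptions for illustration, not anything from the original tip.

```shell
#!/bin/sh
# relativize_links   (hypothetical helper; the site prefix is made up)
# Reads HTML on stdin and strips an absolute prefix from links.
# With | as the s-command delimiter the slashes need no backslashes.
# The dots are still escaped because . matches any character in a regex.
relativize_links() {
    sed 's|http://www\.example\.com/docs/||g'
}
```

Inside the foreach loop it would be used as: relativize_links < xx > $file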
~terry Tue, Sep 18, 2001 (09:51) #6
Finding them all can be done in many, many ways, but here is one way to search every regular file on the machine:
1. Become root.
2. Type:
    find / -type f | xargs grep -l localhost | Mail root &
~terry Sat, Apr 6, 2002 (18:51) #7
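Perl offers a solution that reduces it from a three-step process (change/diff/move) to one:
    perl -i -wpe "s/$INPUT_TXT/$OUTPUT_TXT/g" "$file"
Use double quotes here so the shell expands $INPUT_TXT and $OUTPUT_TXT before perl sees them. Inside single quotes perl would treat them as its own (undefined) variables.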
~terry Thu, Sep 4, 2003 (13:30) #8
I need to rename a whole directory worth of files from one extension to another. Is there a way to combine one of the above tricks with xargs or something to do that easily?
Use a Bourne-type shell:
    for file in *.old; do
        mv $file ${file%.old}.new
    done
In csh/tcsh, I think it would be something like this:
    foreach file ( pattern )
        mv $file $file:r.new
    end
In either case, watch out for the glob matching leading pathnames.
Another method:
    for i in `ls *.foo`
    do
        mv $i `basename $i .foo`.bar
    done
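The loops above trip over file names with spaces because $file and $i are unquoted. A minimal sketch of the same Bourne-shell idea with quoting added; rename_ext is a hypothetical name, and it only looks at the current directory.

```shell
#!/bin/sh
# rename_ext OLD_EXT NEW_EXT   (hypothetical helper)
# Renames every file in the current directory ending in .OLD_EXT so
# that it ends in .NEW_EXT instead. Extensions are given without the dot.
rename_ext() {
    old=$1
    new=$2
    for f in ./*."$old"; do
        [ -e "$f" ] || continue        # the glob matched nothing
        mv -- "$f" "${f%.$old}.$new"   # strip .OLD_EXT, append .NEW_EXT
    done
}
```

For example, rename_ext foo bar turns "my notes.foo" into "my notes.bar".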
~terry Thu, Oct 2, 2003 (23:17) #9
jkcunningham: Is there a relatively easy way to search for a string recursively throughout a directory tree and replace it with another string? It could easily be restricted to files of a certain extension. Thanks.
neo77777: Look at perl scripts; perl is a powerful text editing tool as well.
Mik: I think the easiest way would be to create one simple script to replace a string. Something like:
[CODE]
#!/bin/bash
if [ $# -lt 3 ]
then
    echo "usage: replace file search replacement"
    exit
fi
sed -e "s/$2/$3/g" "$1" > "$1.~bak"
mv "$1.~bak" "$1"
[/CODE]
And then run one command to search through a directory and replace strings using the script above. Something like:
    find . -name "*.txt" -exec \replace {} "some string" "something" \;
It's just a quick script I put together so I can't guarantee that it will work in all cases. But it should work for most files. If it's important data you are going to be running it on, you might want to change the mv into a cp to make sure you still have the backup file in case it goes wrong.
unSpawn: I use rpl (http://www.laffeycomputer.com/rpl.html), easy and safe (simulation mode).
jkcunningham: Thanks. I'll try them both.
Source: http://www.linuxquestions.org/questions/archive/1/2002/07/4/26349
~terry Mon, Mar 8, 2004 (13:03) #10
find ./ -type f -name "foo*" -print | sed 's:\(.*\)/foo\(.*\):mv "&" "\1\/bar\2":' |sh is another way
~terry Mon, Apr 19, 2004 (19:00) #11
Unix Tip 2002-04-24 16:02:31
How to do a global search and replace in unix
    for i in `egrep -lR "spurious dipthong" .`; do perl -i -pe "s/spurious dipthong/non-spurious dipthong/g" $i ; done
The above will find "spurious dipthong" and replace it with "non-spurious dipthong". What's nice about this is that it's fast, and you can use any crazy ass perl regex you want. Also, you can search for files that contain "jay and silent bob" and then replace all occurrences of "bitch" with "boo-boo-kitty-fuck", so as you can see it's pretty versatile.
How it works
It's a normal shell for loop. The for loop is receiving a list (via the magic backticks `) from egrep:
    `egrep -lR "Lisette" .`
The -l means only return file names (and paths). The -R means be recursive. And of course the period at the end means start looking from this location. You can replace the period with any other path.
    perl -i -pe 's/is (cool|awesome)/is super $1/ig' $i
Each matching filename is put in $i (one at a time) and passed to perl, which is in 'in place editing mode' with the -i flag. Then perl does its s///g magic on the file. Note the single quotes in this last example: inside double quotes the shell would expand $1 itself before perl ever saw it.
Think of the fun. Don't forget to back stuff up before doing global search and replaces!
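Before letting perl loose, it can help to see what would be touched. A rough sketch that only reports and changes nothing; count_matches is a made-up name, and it assumes a grep that supports -r (GNU and BSD grep do, strict POSIX grep does not).

```shell
#!/bin/sh
# count_matches DIR STRING   (hypothetical helper)
# Prints "file:count" for every file under DIR that has at least one
# line containing the fixed string STRING. No file is modified.
count_matches() {
    # -r recurse, -c count matching lines per file, -F fixed string.
    # The second grep drops files whose count is exactly 0.
    grep -rcF -- "$2" "$1" | grep -v ':0$'
}
```

Something like count_matches . 'spurious dipthong' gives the list of files, with line counts, that the for loop above would go on to edit.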
~terry Mon, Apr 19, 2004 (19:14) #12
Since it's damn near impossible to find online the simplest way to scan a Unix directory of files, search for one text pattern, and replace it with another, I am now archiving the simplest method I could find (which I've tested and have proven works beautifully). Simply cd to the directory where your files live, modify (or leave) the *.php to match the file type you are modifying, then run the following at the command line:
    for fl in *.php; do
        mv $fl $fl.old
        sed 's/FINDSTRING/REPLACESTRING/g' $fl.old > $fl
        #rm -f $fl.old
    done
Uncomment rm -f $fl.old if you don't want to bother keeping a copy of the old files. Simple, eh? It's all about sed, baby.
~terry Mon, Apr 19, 2004 (19:30) #13
#!/bin/sh
if [ $# -lt 3 ] ; then
    echo -e "Wrong number of parameters."
    echo -e "Usage:"
    echo -e " renall 'filepat' findstring replacestring\n"
    exit 1
fi
#echo $1 $2 $3
for i in `find . -name "$1" -exec grep -l "$2" {} \;`
do
    mv "$i" "$i.sedsave"
    sed "s/$2/$3/g" "$i.sedsave" > "$i"
    echo $i
    #rm "$i.sedsave"
done
~terry Mon, Apr 19, 2004 (19:32) #14
The code:
    #!/usr/local/bin/perl
    #
    # Usage: rename perlexpr [files]
    ($regexp = shift @ARGV) || die "Usage: rename perlexpr [filenames]\n";
    if (!@ARGV) {
        @ARGV = <STDIN>;
        chomp(@ARGV);
    }
    foreach $_ (@ARGV) {
        $old_name = $_;
        eval $regexp;
        die $@ if $@;
        rename($old_name, $_) unless $old_name eq $_;
    }
    exit(0);
The Explanation
Save the above code into a file called rename. Make sure that the permissions are set correctly so that you can execute the script. Also check to make sure that your Perl interpreter is in /usr/local/bin. If Perl is somewhere else, you'll need to change the first line to point to where Perl is installed on your system.
To use the script:
    rename perlexpr [files]
where perlexpr is the substitution operator, i.e., s///. You can actually pass any Perl expression through to perlexpr, allowing you to do more complex file renaming actions. The files argument is a list of filenames that you want to change. You can leave the files argument out and the script will take a list of names from STDIN.
The Examples
Make all the files in the directory end with .html instead of .txt:
    rename 's/txt$/html/' *
Change all the files prefixed with the text mah and suffixed with .new to be suffixed with .old instead:
    rename 's/new$/old/' mah*.new
Hide every file in the directory by prefixing the filename with a period:
    rename 's/(.+)/\.$1/' *
The possibilities are endless. You should be careful, however, as you are dealing with regular expressions. Be as specific as possible when specifying your patterns, otherwise you may rename a file in a way that you had not anticipated. For instance, take the first example. If you had typed:
    rename 's/txt/html/' *
(notice the missing $ in the pattern?) and you had a file named newtxt.txt, the script would rename the file to newhtml.txt, which might not have been what you wanted.
Hopefully this script will be useful to you. If you have any problems or questions, you can e-mail them to me at dmah@vox.org
~terry Mon, Apr 19, 2004 (19:49) #15
find . -name index.shtml -exec perl -pi.bak -e "s/string1/string2/g" {} \;
~terry Tue, Apr 20, 2004 (23:07) #16
Here's a *file* renaming utility I got from Jeff Monks.
    for x in `find . -name temp_index.htm`
    do
        dir=`dirname $x`
        mv $x $dir/index.html
    done
Need to test it.
~terry Tue, Jan 31, 2006 (20:55) #17
Want to use sed(1) to edit a file in place? To replace every 'e' with an 'o' in a file named 'foo', you can do:
    sed -i.bak 's/e/o/g' foo
and you'll get a backup of the original in a file named 'foo.bak'. If you want no backup, with BSD sed (including macOS):
    sed -i '' 's/e/o/g' foo
With GNU sed, use -i on its own, with no separate empty argument:
    sed -i 's/e/o/g' foo
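Since the -i flag behaves differently between sed implementations, one way around it is to skip -i and go through a temporary file. A minimal sketch, assuming mktemp is available (it is common but not strictly POSIX); sed_inplace is a hypothetical name.

```shell
#!/bin/sh
# sed_inplace SCRIPT FILE   (hypothetical helper)
# Applies the sed SCRIPT to FILE without relying on sed -i.
# The original is kept as FILE.bak.
sed_inplace() {
    tmp=$(mktemp) || return 1
    if sed "$1" "$2" > "$tmp"; then
        # Back up first, then overwrite the contents in place so the
        # file keeps its owner and permissions.
        cp -p "$2" "$2.bak" && cat "$tmp" > "$2"
    fi
    rm -f "$tmp"
}
```

The example from the post becomes: sed_inplace 's/e/o/g' foo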
~terry Wed, Mar 8, 2006 (08:18) #18
http://www.uwo.ca/its/doc/hdi/web/treesed.html#replacing

treesed: How to Use Treesed

First you log in to panther.uwo.ca, and go to the directory where you want to search or make changes. There are two choices you can make when using treesed:
1. Do I just want to search for a text, or do I want to search for a text and replace it with something else? If you are just searching you are using Treesed in "search mode", otherwise it is in "replace mode."
2. Do I want to search/replace only in files in my current directory, or should files in all subdirectories (and all directories below that) also be done?
Some examples will make this clear.

Searching

Say you are faced with the situation that the author of a slew of web pages, Nathan Brazil, has left and has been succeeded by Mavra Chang. First, let us see which files are affected by this (what you type in follows the panther prompt):

    [10:52am panther] treesed "Nathan Brazil" -files *.html
    search_pattern: Nathan\ Brazil
    replacement_pattern:
    ** Search mode
    . midnight.html: 1 lines on: 2
    .. well.html: 1 lines on: 3

We notice the following:
* The search text "Nathan Brazil" is enclosed in double quotes (").
* You specify which files to search with -files followed by a list of file names, in this case *.html.
* Treesed reports the search pattern ("pattern" is just a fancy word for "text") you specified (you can ignore that \).
* Treesed reports an empty replacement_pattern. This is correct, because you haven't entered one.
* It therefore deduces that it is in search mode.
* It finds two files containing "Nathan Brazil", and reports on which lines of these files it found it; it does not show the lines themselves.

Because you used -files, Treesed will search in the files you specify in the current directory. You can also search files in the current directory and all directories below it. However, in that case you cannot specify which file names to use; all files will be searched:

    [11:02am panther] treesed "Nathan Brazil" -tree
    search_pattern: Nathan\ Brazil
    replacement_pattern:
    ** Search mode
    . midnight.html: 1 lines on: 2
    ... well.html: 1 lines on: 3
    . new/echoes.html: 1 lines on: 2

We notice the following:
* Instead of -files we now see -tree.
* We do not see a specification of file names.
* Treesed finds an occurrence of "Nathan Brazil" in the file echoes.html in the subdirectory new; it did not find this file in the previous example (as it shouldn't).

Replacing

To replace a text you simply add the replacement text right after the search text:

    [11:17am panther] treesed "Nathan Brazil" "Mavra Chang" -files *.html
    search_pattern: Nathan\ Brazil
    replacement_pattern: Mavra Chang
    ** EDIT MODE!
    . midnight.html: 1 lines on: 2
    Replaced Nathan\ Brazil by Mavra Chang on 1 lines in midnight.html
    .. well.html: 1 lines on: 3
    Replaced Nathan\ Brazil by Mavra Chang on 1 lines in well.html

We notice the following:
* Right after the search text "Nathan Brazil" you specify the replacement text "Mavra Chang".
* As a result, Treesed now reports a non-empty replacement_pattern.
* Hence it concludes it is in "edit mode", which means replacement mode.
* Treesed dutifully reports on which lines in which files it did the replacement.

To replace a text in all files in the current directory and the ones below it, we do the following:

    [11:17am panther] treesed "Nathan Brazil" "Mavra Chang" -tree
    search_pattern: Nathan\ Brazil
    replacement_pattern: Mavra Chang
    ** EDIT MODE!
    . midnight.html: 1 lines on: 2
    Replaced Nathan\ Brazil by Mavra Chang on 1 lines in midnight.html
    .... well.html: 1 lines on: 3
    Replaced Nathan\ Brazil by Mavra Chang on 1 lines in well.html
    . new/echoes.html: 1 lines on: 2
    Replaced Nathan\ Brazil by Mavra Chang on 1 lines in new/echoes.html

and we get the expected results, including the replacement in new/echoes.html.

Old Versions

Treesed leaves behind quite a mess of old versions of the files it changed (only in replace mode, of course). These old files have the same name as the original file, with .ddddd appended to it. For example, if treesed makes a change to midnight.html it will leave the original version as something like midnight.html.26299. You'll have to remove these files lest your disk area clutters up. Here is a command that does that, but beware! It removes every file, in the current directory and all below it, whose name contains a period followed by a digit (and then anything at all), so a file such as notes.1st would go too:

    find . -name "*.[0-9]*" -exec rm {} \;

It is interesting to note that if you use treesed again without cleaning up, you may get files like midnight.html.26299.27654. These will also be cleaned up by the above slightly dangerous command.

About Treesed

treesed is public domain software developed and designed by Rick Jansen from Sara, Amsterdam, Netherlands, January 1996.
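The cleanup command in the treesed notes is broader than its description: the find pattern also matches names like notes.1st. A stricter sketch that lists only files whose last suffix is all digits, and deletes nothing; list_treesed_backups is a hypothetical name, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# list_treesed_backups DIR   (hypothetical helper)
# Prints files under DIR whose final suffix is a period followed by
# digits only, e.g. midnight.html.26299. It lists; it never removes.
list_treesed_backups() {
    find "$1" -type f -name '*.[0-9]*' | while IFS= read -r f; do
        case ${f##*.} in              # text after the last period
            '')       ;;              # name ends in a period: skip
            *[!0-9]*) ;;              # suffix has a non-digit: skip
            *)        printf '%s\n' "$f" ;;
        esac
    done
}
```

Once the list looks right, the printf line could be swapped for rm -- "$f" to do the actual cleanup.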
~terry Wed, Mar 8, 2006 (08:28) #19
download treesed
http://fresh.t-systems-sfr.com/cgi-bin/warex?unix/src/misc/treesed.Z
http://fresh.t-systems-sfr.com/cgi-bin/warex?unix/src/misc/treesed.gz
http://fresh.t-systems-sfr.com/cgi-bin/warex?unix/src/misc/treesed.bz2
http://fresh.t-systems-sfr.com/cgi-bin/warex?unix/src/misc/treesed.zip
http://fresh.t-systems-sfr.com/unix/src/misc/.warix/treesed.html
~terry Wed, Mar 8, 2006 (09:14) #20
http://www.webmasterworld.com/forum46/495-1-10.htm has some options
~terry Wed, Mar 8, 2006 (09:49) #21
http://www.laffeycomputer.com/rpl.html

rpl - Replace Strings - from Laffey Computer Imaging
Price: $0 (Copyrighted FreeWare)
Current Version: 1.4.0
Date Modified: July 22, 2002
Featured as Tool of the Month on UnixReview!

Overview

rpl is a UN*X text replacement utility. It will replace strings with new strings in multiple text files. It can work recursively over directories and supports limiting the search to specific file suffixes.

    rpl [-iwRspfdtx [-q|-v]] <old_str> <new_str> <target_file(s)>

Details

rpl replaces old_str with new_str in all target files. It returns the number of strings replaced or a system error code (non-zero) if there is an error.

Note that you should put strings in single quotes if they contain spaces. You must also escape all shell meta-characters. It's a good idea to put ALL strings in single quotes. If one of the strings starts with a "-" you need to put "--" as the last argument BEFORE the string. This will prevent the options parser from treating the string as a command-line option. For example:

    rpl -i -- '-8x' '+8x' myfile

which would replace occurrences of "-8x" with "+8x" in the file myfile (ignoring case).

A period will be printed to stderr as each target file is processed to give you feedback on the replacement progress. You may use the quiet (-q) option to suppress all output but major error reporting.

rpl will attempt to maintain the owner, group and permissions of your original files. For safety, rpl creates a temporary file and makes changes to that file. It then moves the temporary file over the original file. rpl sets the owner, group, and permissions of the new file to match those of the original file. In some circumstances rpl will not be able to do this (such as when a file is owned by the superuser but you have group write permission). In these cases rpl will warn you that the owner/group or permissions cannot be set and that file will be skipped, unless you use the force (-f) option.

Note that the use of temp files in predictable, world-writeable locations could lead to symlink attacks. Ideally you should set the $TMPDIR environment variable to a private directory readable and writeable only by you. This is especially important if running rpl as root. You have been warned!

rpl can be placed in simulation mode (-s), in which rpl will print a list of files that would be modified if an actual replace operation were executed. This is useful when you are about to make changes to a larger group of files, possibly in many directories.

rpl can be placed into prompt mode (-p). In this mode rpl will examine each file, printing a period as each file is scanned. If a match is found rpl will prompt you to save the replacements made to that file. Answering "y", or pressing Return, will save the changes. Answering "n" will leave that file untouched. rpl will then move on to the remaining target files. Note that you will only be prompted for files which had a match. If no match is found a period is printed to give you an indication that rpl is working. (This is useful when, for instance, you are performing a large recursive batch replacement on a collection of files.)

Normally, rpl will change the modification time of all files it processes like any other program. However, you may instruct rpl to keep the original modification times using the -d (Don't alter mod-times) option.

You can specify file suffixes to be searched using the -x option. Any files that do not match the specified suffixes will not be searched or modified. The -x option may be used more than once to tell rpl to search files with varying suffixes. For instance, say you wanted to search all of your ".html", ".htm", and ".php" files: you would add " -x'.html' -x'.htm' -x'.php' " to your command line. rpl would then skip any files that did not end with these suffixes. This is mainly useful when doing recursive searching (-R option).

OPTIONS

-i  Ignore case of old_str
    rpl will match the old_str in the searched file regardless of the case. The case of new_str will not be altered.
-w  Whole words (old_str bounded by white space in file)
    rpl will only match old_str if it is bounded by the start of a line, a space, a tab, or the end of a line.
-q  Quiet mode (no output at all)
    Good for shell scripts, etc.
-v  Verbose mode (lots of output)
    rpl will list the name of each file and directory, and the line numbers that contain matches.
-R  Search directories recursively
    rpl will scan every file and every directory recursively. Without this option directories will be skipped.
-x  Specify file suffixes to search (e.g. ".html", ".c", etc.)
    May be used multiple times. See above for details.
-p  Prompt for each file
    rpl will prompt you before scanning each file. If you respond 'N' or 'n' rpl will skip that file and move on to the next file. The default action if you press enter is to process the file.
-s  Simulation mode
    rpl will scan all of the files and list the names of files that it would modify if a replace operation was executed. If you turn on the verbose (-v) option as well, rpl will list the line numbers where the string was matched.
-e  Honor escapes
    rpl will honor escape sequences in old_string and new_string. Standard escapes such as "\t" (tab), "\n" (newline), "\r" (carriage return) are processed, as well as any octal or hexadecimal ASCII codes. Octal ASCII codes start with a '\' and are comprised of three digits [0-7] (e.g. '\015'). Hexadecimal ASCII codes start with '\0x' followed by two characters [0-f] (e.g. '\0x0d'). The 'x' and the [a-f] may be upper or lowercase. When you use this switch you must escape all backslash ('\') characters with another backslash (e.g. '\\').
-f  Force mode
    rpl will overwrite files even if the owner, group, or permissions of the new file will not match the original. Obviously, rpl cannot overwrite files if the user does not have write permission.
-d  Don't change modification times
    rpl will process files, but keep their original modification times.
-t  Use $TMPDIR for temporary files
    Causes rpl to write temporary files to the directory specified by the environment variable $TMPDIR instead of writing the temp files to the original file dir.
-L  Display the software license
    This displays the software license that you agree to by using rpl.
-h  Display a brief summary of options
~terry Wed, Mar 8, 2006 (10:02) #22
Tool of the Month: rpl
by Joe "Zonker" Brockmeier

This month, I'll introduce a tool that is handy for admins, programmers, and anybody who works with text files on a regular basis. The utility is rpl, short for "replace strings", which is exactly what it does. rpl is a simple utility that searches files for a text string and replaces that text string with another that you specify.

Replace Strings with rpl

It's possible to replace text strings in multiple files with numerous *nix utilities, but that involves getting to know a programming language or the arcane syntax of sed, awk, or some other program. While I'm a big fan of sed and Perl, for example, there's also something to be said for a utility that allows users to become productive in a matter of minutes rather than learning a programming language. That's where rpl comes in. Is it as powerful as sed or Perl? Nope. But it is a quick and easy way to make changes in text files, and it shouldn't take more than a few minutes to learn.

The basic syntax of rpl is rpl 'oldtext' 'newtext' filename. It doesn't really get much simpler than that, now does it? Note that strings should be placed inside single quotes (') so that the shell doesn't try to treat part of your text string as a special character. If you're replacing a single word with another single word (in other words, no white space) then it's not mandatory to place your strings inside single quotes, but it's a good habit to get into.

There are also several options that may be of interest when using rpl. Let's say you want to replace all instances of the string "Copyright 2003-2004" with "Copyright 2003-2005" in all files with the extensions ".php," ".php3" or ".html" in the directory public_html and all of its subdirectories. It's a simple task using rpl:

    rpl -R -x .php -x .html -x .php3 'Copyright 2003-2004' 'Copyright 2003-2005' *

The -R option tells rpl to look for the term recursively.
Note that the -x option is used multiple times rather than using the option once and specifying several extensions afterwards. Even though you're passing the "*" wildcard to the shell, rpl will only work on files with one of the extensions specified.

Very often, the string you want to replace may also be present within a larger string. For example, if you wanted to replace the word "write" in a set of files with another string, you might not want to insert the string into words like "rewrite," "written," and so forth. To tell rpl to match a string only when it is bounded by whitespace, use the -w option:

    rpl -w 'write' 'replace' *

What if you're not sure which files contain a term? The -p option will cause rpl to prompt you for each file that will be changed. Note that rpl will not prompt for each change, but only for each file that will be changed. A file might have only one change, or several hundred.

To find out ahead of time what files will be changed, use the -s "simulation" option. This will cause rpl to search out and list the files that would be changed, without making any changes on that pass.

If you'd like to make changes to files without changing their modification time, use the -d option.

Like most *nix utilities, rpl is case-sensitive by default. If you'd like to match instances of a string regardless of case, use the -i option. When using rpl -i, specifying "abc" as the old string will match "abc," "ABC," "aBc", and so forth. This can be particularly handy when replacing filenames in HTML files produced by users working on operating systems that are not case-sensitive.

There are a few other useful options with rpl. Be sure to check the man page for rpl, and test it out a bit. It's not quite as powerful as using sed or Perl, but it's a nice tool when you're doing simple search and replace operations.

Getting rpl

The rpl utility is a freebie from Laffey Computer Imaging.
Source and binaries are available from the site. For Debian users, rpl is just an apt-get away.

From http://www.unixreview.com/documents/s=8989/ur0407h/
~terry Wed, Mar 8, 2006 (10:35) #23
So far, good. rpl seems to be the answer I've been looking for to do global search and replace on unix files. I just replaced bank@spring.net with banking@wholetech.com for the address for donations to the Spring. The jury's still out. Let's see if this works.