brent_weaver Guru
Joined: 01 Jul 2004 Posts: 510 Location: Burlington, VT
|
Posted: Tue Dec 04, 2007 11:33 pm Post subject: reg exp and replacing white space with [ SOLVED ] |
|
|
Hello - I am using filenames in a script that have spaces in them. As a result I need to use a reg exp to replace every space with a \ followed by a space. For example:
/home/user/file name.ext
needs to be:
/home/user/file\ name.ext
How do I do this? _________________ Brent Weaver
Last edited by brent_weaver on Thu Dec 06, 2007 3:53 pm; edited 1 time in total |
|
poly_poly-man Advocate
Joined: 06 Dec 2006 Posts: 2477 Location: RIT, NY, US
|
Posted: Wed Dec 05, 2007 12:27 am Post subject: |
|
|
What do you need this for?
Assuming, for example, that you are grabbing filenames from a 1-per-line list (no other ideas where this would be useful come to mind), something like
sed 's#\ #\\\ #'
(well, I'm not sure how many backslashes you need, but you get the idea)
poly-p man |
|
brent_weaver Guru
Joined: 01 Jul 2004 Posts: 510 Location: Burlington, VT
|
Posted: Wed Dec 05, 2007 12:47 am Post subject: |
|
|
Thanks for the response. This does not work if there are multiple spaces in the filename. Any other advice? I would like to use a regular expression if possible. _________________ Brent Weaver |
|
Hu Moderator
Joined: 06 Mar 2007 Posts: 21490
|
Posted: Wed Dec 05, 2007 3:31 am Post subject: |
|
|
Instruct sed to replace every match on the line, not just the first, by adding a g modifier after the last #. However, you may be better served by forgoing the attempt to escape the filename and instead quoting the variable properly, which will effectively escape all the special characters at once. Can you post an excerpt of the script which is processing these filenames? |
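To illustrate the g modifier, here is a minimal sketch. The sample path and the variable names are made up for the demonstration; only the sed expression with and without the trailing g comes from this thread.

```shell
#!/bin/sh
# Hypothetical sample path containing two spaces.
path='/home/user/my file name.ext'

# Without g, sed replaces only the first space on each line.
first_only=$(printf '%s\n' "$path" | sed 's# #\\ #')

# With g, every space on the line is prefixed with a backslash.
every_space=$(printf '%s\n' "$path" | sed 's# #\\ #g')

printf '%s\n' "$first_only" "$every_space"
```

The first line printed should still contain one unescaped space, while the second has a backslash before both.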
|
brent_weaver Guru
Joined: 01 Jul 2004 Posts: 510 Location: Burlington, VT
|
Posted: Wed Dec 05, 2007 12:33 pm Post subject: |
|
|
Here are what some of the filenames look like:
/home/bweaver/pictures/'06_06_04_01
/home/bweaver/pictures/10 2004
/home/bweaver/pictures/10 2004/101MSDCF
/home/bweaver/pictures/10 2004/101MSDCF/DSC00825.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00829.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00830.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00851.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00930.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00931.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00932.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00936.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00938.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00940.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00943.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00944.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00945.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00946.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00947.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00948.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00950.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00951.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00953.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00954.JPG
The script is:
Code: |
#!/usr/bin/perl
open(IN,"<infile.dat") or die "EXIT!";
while (<IN>) {
system "/usr/bin/cksum $_";
}
close IN;
exit;
|
My goal is to run a cksum on all the files in infile.dat to check for duplicates. So this will of course fail due to white spaces. I think that quoting is going to get messy since the perl system call is already in quotes...
Thank you all for this GREAT help! _________________ Brent Weaver |
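For comparison, a rough sketch of the quoting approach suggested earlier in the thread, written in shell rather than Perl. The scratch directory and file names are invented for the demonstration; the point is only that a quoted "$file" reaches cksum as a single argument, so no backslash escaping is needed.

```shell
#!/bin/sh
# Hypothetical setup: a scratch directory with two files whose paths contain spaces.
workdir=$(mktemp -d)
mkdir -p "$workdir/10 2004"
printf 'a\n' > "$workdir/10 2004/DSC 1.JPG"
printf 'b\n' > "$workdir/10 2004/DSC 2.JPG"
printf '%s\n' "$workdir/10 2004/DSC 1.JPG" "$workdir/10 2004/DSC 2.JPG" > "$workdir/infile.dat"

# Read one path per line. IFS= and -r keep leading spaces and backslashes intact,
# and the double quotes around $file stop the shell from splitting on spaces.
checksums=$(while IFS= read -r file; do
    cksum "$file"
done < "$workdir/infile.dat")

printf '%s\n' "$checksums"
rm -rf "$workdir"
```

This assumes one path per line in the list, so a filename containing a newline would still break it.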
|
Akkara Bodhisattva
Joined: 28 Mar 2006 Posts: 6702 Location: &akkara
|
Posted: Wed Dec 05, 2007 1:12 pm Post subject: |
|
|
Here's an alternative way that bypasses the space problem which you might find helpful:
Code: |
find . -type f -name "*.jpg" -exec md5sum {} \;
|
To find duplicates, just pass this to sort, and then uniq:
Code: |
find . -type f -name "*.jpg" -exec md5sum {} \; | sort | uniq -D --check-chars=32
|
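A small sketch of that duplicate check on scratch data, assuming GNU coreutils (md5sum, and uniq with -D and --check-chars). The directory and file names are made up; two of the three files have identical content.

```shell
#!/bin/sh
# Hypothetical scratch data: two identical files and one different file.
workdir=$(mktemp -d)
printf 'same\n'  > "$workdir/first copy.jpg"
printf 'same\n'  > "$workdir/second copy.jpg"
printf 'other\n' > "$workdir/unique.jpg"

# An MD5 digest is 32 hex characters, so uniq compares only that prefix
# and prints every line whose digest occurs more than once.
dupes=$(find "$workdir" -type f -name "*.jpg" -exec md5sum {} \; \
    | sort | uniq -D --check-chars=32)

printf '%s\n' "$dupes"
rm -rf "$workdir"
```

Only the two copies should be listed. The spaces in their names cause no trouble because find passes each path to md5sum as a single argument.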
|
|
brent_weaver Guru
Joined: 01 Jul 2004 Posts: 510 Location: Burlington, VT
|
Posted: Wed Dec 05, 2007 1:20 pm Post subject: |
|
|
Hey thanks for the response. I still think that it cannot be that hard to do this replacement in a regexp. This code did not necessarily work, because there are mixed-case filenames and different extensions.
Thanks! _________________ Brent Weaver |
|
Akkara Bodhisattva
Joined: 28 Mar 2006 Posts: 6702 Location: &akkara
|
Posted: Wed Dec 05, 2007 1:30 pm Post subject: |
|
|
Well, to just replace spaces, you can do this:
Code: |
echo "one test two test three" | sed 's: :\\ :g'
|
I'm not sure how that would carry over to perl however.
But there are other characters that might appear in filenames that could also cause problems. Quotes, for one, and, of course, backslashes also need to be escaped. Oh, and * characters, and '&', ';', '<', '>', and probably others I forgot. I think this should work:
Code: |
echo "test'ing back\slash \"quote\" *&;<>" | sed 's:[ *&;<>'"'"'"\\]:\\&:g'
|
I vaguely recall perl (or was it python?) had some sort of quotify function that does this sort of thing.
It is probably easier to just put single-quotes around the whole filename, after first escaping any single-quotes in it:
Code: |
echo "test'ing back\slash \"quote\" *&;<>" | sed -e "s:':'\"'\"':g" -e "s:.*:'&':"
|
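One way to check the single-quote approach is a round trip: quote the name, let the shell parse the quoted form back, and compare. This is only a sketch; shell_quote is a hypothetical helper name, and the test string is the one used in the examples above.

```shell
#!/bin/sh
# shell_quote is a hypothetical helper wrapping the two sed expressions:
# first turn each ' into '"'"', then wrap the whole line in single quotes.
shell_quote() {
    printf '%s\n' "$1" | sed -e "s:':'\"'\"':g" -e "s:.*:'&':"
}

# Contains a single quote, a backslash, double quotes and shell metacharacters.
name="test'ing back\\slash \"quote\" *&;<>"

quoted=$(shell_quote "$name")
printf '%s\n' "$quoted"

# Have the shell parse the quoted form again; it should yield the original string.
eval "roundtrip=$quoted"
```

Because sed works line by line, this sketch does not handle names that contain a newline.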
|
|
Hu Moderator
Joined: 06 Mar 2007 Posts: 21490
|
Posted: Thu Dec 06, 2007 3:37 am Post subject: |
|
|
brent_weaver wrote: | Hey thanks for the response. I still think that it cannot be that hard to do this replacement in a regexp. This code did not necessarily work, because there are mixed-case filenames and different extensions.
Thanks! |
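Use -iname to make the pattern case insensitive. You can chain together a set of disjunctions if there is no one pattern that matches all the files. For example, find . \( -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' \) -exec md5sum {} \; will match jpg, png, and gif files in any letter case. The parentheses matter: -o binds more loosely than the implicit "and" in front of -exec, so without them only the gif files would be passed to md5sum. |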
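A quick sketch of how grouping changes what find hands to -exec, using made-up empty files in a scratch directory. cksum stands in for md5sum here, and -iname is assumed to be available (it is in GNU and BSD find, but not required by POSIX).

```shell
#!/bin/sh
# Hypothetical scratch files: three images with mixed-case extensions and one text file.
workdir=$(mktemp -d)
touch "$workdir/a one.JPG" "$workdir/b.png" "$workdir/c.gif" "$workdir/notes.txt"

# Grouped: -exec applies to a file matching any of the three patterns.
grouped=$(find "$workdir" -type f \
    \( -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' \) -exec cksum {} \;)

# Ungrouped: -exec attaches only to the last test, so only the gif is checksummed.
ungrouped=$(find "$workdir" -type f \
    -name '*.jpg' -o -name '*.png' -o -name '*.gif' -exec cksum {} \;)

printf 'grouped:\n%s\nungrouped:\n%s\n' "$grouped" "$ungrouped"
rm -rf "$workdir"
```

The grouped run should list all three image files, while the ungrouped run lists only c.gif.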
|