Gentoo Forums
reg exp and replacing white space with [ SOLVED ]
brent_weaver
Guru
Joined: 01 Jul 2004
Posts: 510
Location: Burlington, VT

Posted: Tue Dec 04, 2007 11:33 pm    Post subject: reg exp and replacing white space with [ SOLVED ]

Hello - I am using filenames in a script that have spaces in them. As a result I need to use a regular expression to replace every space with a \ followed by a space. For example:


/home/user/file name.ext
needs to be:
/home/user/file\ name.ext

How do I do this?
_________________
Brent Weaver


Last edited by brent_weaver on Thu Dec 06, 2007 3:53 pm; edited 1 time in total
poly_poly-man
Advocate
Joined: 06 Dec 2006
Posts: 2477
Location: RIT, NY, US

Posted: Wed Dec 05, 2007 12:27 am

What do you need this for?

Assuming, for example, that you are grabbing filenames from a 1-per-line list (no other ideas where this would be useful come to mind), something like

sed 's#\ #\\\ #'

(well, I'm not sure how many slashes you need, but you get the idea ;) )

poly-p man
_________________
avatar: new version of logo - see topic 838248. Potentially still a WiP.
brent_weaver
Guru
Joined: 01 Jul 2004
Posts: 510
Location: Burlington, VT

Posted: Wed Dec 05, 2007 12:47 am

Thanks for the response. This does not work if there are multiple spaces in the filename. Any other advice? I would like to use a regular expression if possible.
_________________
Brent Weaver
Hu
Moderator
Joined: 06 Mar 2007
Posts: 21490

Posted: Wed Dec 05, 2007 3:31 am

Instruct sed to replace every match on each line, not just the first, by adding a g modifier after the last #. However, you may be better served by forgoing the attempt to escape the filename and instead quoting the variable properly, which will effectively escape all the special characters at once. Can you post an excerpt of the script which is processing these filenames?
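
A minimal sketch of both suggestions, assuming a POSIX shell and sed. The sample path is hypothetical and only serves to show the difference the g modifier makes:

```shell
# Hypothetical sample path containing several spaces.
path='/home/user/my file  name.ext'

# Without g, sed replaces only the first space on the line.
first_only=$(printf '%s\n' "$path" | sed 's# #\\ #')

# With g, every space on the line is escaped.
all_spaces=$(printf '%s\n' "$path" | sed 's# #\\ #g')

printf '%s\n' "$first_only" "$all_spaces"

# Quoting the variable avoids the escaping step altogether:
# the shell passes the whole value as a single argument.
set -- "$path"
echo "argument count: $#"
```

The quoted form is usually the safer of the two, because it also covers characters other than spaces without any extra patterns.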
brent_weaver
Guru
Joined: 01 Jul 2004
Posts: 510
Location: Burlington, VT

Posted: Wed Dec 05, 2007 12:33 pm

Here are what some of the filenames look like:

/home/bweaver/pictures/'06_06_04_01
/home/bweaver/pictures/10 2004
/home/bweaver/pictures/10 2004/101MSDCF
/home/bweaver/pictures/10 2004/101MSDCF/DSC00825.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00829.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00830.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00851.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00930.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00931.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00932.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00936.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00938.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00940.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00943.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00944.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00945.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00946.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00947.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00948.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00950.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00951.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00953.JPG
/home/bweaver/pictures/10 2004/101MSDCF/DSC00954.JPG

The script is:

Code:

#!/usr/bin/perl

open(IN,"<infile.dat") or die "EXIT!";
while (<IN>) {
        system "/usr/bin/cksum $_";
        }

close IN;
exit;


My goal is to run cksum on all the files in infile.dat to check for duplicates. So this will of course fail due to white space. I think that quoting is going to get messy since the Perl system call is already in quotes...
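
For reference, the quoting approach suggested earlier in the thread can be sketched as a plain shell loop. This is only an illustration under stated assumptions: the temporary directory and file names are made up to mimic the listing above, and the loop assumes one filename per line with no embedded newlines. In Perl, calling chomp on each line and using the list form system("/usr/bin/cksum", $_) should have a similar effect, since the list form does not pass the arguments through a shell.

```shell
# Build a throwaway directory with a space in a directory name
# (hypothetical names that mimic the listing above).
tmp=$(mktemp -d)
mkdir -p "$tmp/10 2004/101MSDCF"
printf 'picture data\n' > "$tmp/10 2004/101MSDCF/DSC00825.JPG"
printf '%s\n' "$tmp/10 2004/101MSDCF/DSC00825.JPG" > "$tmp/infile.dat"

# Read one filename per line. IFS= and -r keep leading blanks and
# backslashes intact; the double quotes keep each name in one argument.
sums=$(while IFS= read -r file; do
    cksum -- "$file"
done < "$tmp/infile.dat")

printf '%s\n' "$sums"
rm -rf "$tmp"
```

No escaping is needed here because the filename never gets re-parsed as shell syntax.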

Thank you all for this GREAT help!
_________________
Brent Weaver
Akkara
Bodhisattva
Joined: 28 Mar 2006
Posts: 6702
Location: &akkara

Posted: Wed Dec 05, 2007 1:12 pm

Here's an alternative way that bypasses the space problem which you might find helpful:
Code:
find . -type f -name "*.jpg" -exec md5sum {} \;


To find duplicates, just pass this to sort, and then uniq:
Code:
find . -type f -name "*.jpg" -exec md5sum {} \; | sort | uniq -D --check-chars=32
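
A small self-contained check of that pipeline, assuming GNU coreutils (md5sum, and uniq with -D and --check-chars). The directory and file names are hypothetical:

```shell
# Throwaway test data: two identical files and one different file,
# with spaces in a directory name and in a file name.
tmp=$(mktemp -d)
mkdir "$tmp/10 2004"
printf 'same bytes\n'  > "$tmp/10 2004/a.jpg"
printf 'same bytes\n'  > "$tmp/10 2004/b copy.jpg"
printf 'other bytes\n' > "$tmp/c.jpg"

# find hands each name to md5sum as its own argument, so spaces need
# no escaping. {} + batches many files into one md5sum call.
# --check-chars=32 makes uniq compare only the 32-character hash.
dups=$(find "$tmp" -type f -name '*.jpg' -exec md5sum {} + \
    | sort | uniq -D --check-chars=32)

printf '%s\n' "$dups"
rm -rf "$tmp"
```

Only the two files with identical content should be listed. Using {} + instead of {} \; is a design choice for speed, not a requirement.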
brent_weaver
Guru
Joined: 01 Jul 2004
Posts: 510
Location: Burlington, VT

Posted: Wed Dec 05, 2007 1:20 pm

Hey, thanks for the response. I still think that it cannot be that hard to do this replacement in a regexp. This code did not necessarily work, because there are mixed-case filenames and different extensions.

Thanks!
_________________
Brent Weaver
Akkara
Bodhisattva
Joined: 28 Mar 2006
Posts: 6702
Location: &akkara

Posted: Wed Dec 05, 2007 1:30 pm

Well, to just replace spaces, you can do this:
Code:
echo "one test  two  test   three" | sed 's: :\\ :g'


I'm not sure how that would carry over to perl however.

But there are other characters that might appear in filenames and could also cause problems: quotes, for one, and, of course, backslashes, which also need to be escaped. Oh, and * characters, and '&', ';', '<', '>', and probably others I forgot. I think this should work:
Code:
echo "test'ing back\slash \"quote\" *&;<>" | sed 's:[ *&;<>'"'"'"\\]:\\&:g'


I vaguely recall perl (or was it python?) had some sort of quotify function that does this sort of thing.

It is probably easier to just put single-quotes around the whole filename -- after first escaping any single-quotes in it!:
Code:
echo "test'ing back\slash \"quote\" *&;<>" |  sed -e "s:':'\"'\"':g" -e "s:.*:'&':"
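
The single-quote approach can be wrapped in a small function and checked with a round trip through eval. This is a sketch only: shell_quote is a hypothetical helper name, and it assumes the name contains no newline characters.

```shell
# shell_quote is a hypothetical helper, not a standard utility.
# It rewrites each ' as '"'"' and then wraps the line in single quotes.
shell_quote() {
    printf '%s\n' "$1" | sed -e "s:':'\"'\"':g" -e "s:.*:'&':"
}

name="test'ing back\\slash \"quote\" *&;<>"
quoted=$(shell_quote "$name")
printf '%s\n' "$quoted"

# Round trip: let the shell parse the quoted form and compare it
# with the original string.
eval "roundtrip=$quoted"
[ "$roundtrip" = "$name" ] && echo "round trip OK"
```

Inside single quotes the shell treats every character literally, which is why only the single quote itself needs special handling.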
Hu
Moderator
Joined: 06 Mar 2007
Posts: 21490

Posted: Thu Dec 06, 2007 3:37 am

brent_weaver wrote:
Hey, thanks for the response. I still think that it cannot be that hard to do this replacement in a regexp. This code did not necessarily work, because there are mixed-case filenames and different extensions.

Thanks!


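Use -iname to make the pattern case insensitive. You can chain together a set of disjunctions if there is no one pattern that matches all the files, but group them in escaped parentheses: the implicit "and" in front of -exec binds more tightly than -o, so without the grouping -exec would apply to the last pattern only. For example, find . \( -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' \) -exec md5sum {} \; will run md5sum on jpg, png, and gif files.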
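
The effect of grouping can be checked on a few empty files. This sketch assumes a find that supports -iname (GNU and BSD find do), uses hypothetical file names, and uses -print as a stand-in for -exec md5sum {} \; so the result is easy to inspect:

```shell
# Throwaway test data: three image names (one upper case) and one
# file that should never match.
tmp=$(mktemp -d)
: > "$tmp/a.JPG"; : > "$tmp/b.png"; : > "$tmp/c.gif"; : > "$tmp/d.txt"

# Ungrouped: the implicit "and" binds tighter than -o, so the action
# is attached to the last test only and just c.gif is listed.
ungrouped=$(find "$tmp" -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' -print | sort)

# Grouped: the action applies to a file matching any of the patterns.
grouped=$(find "$tmp" \( -iname '*.jpg' -o -iname '*.png' -o -iname '*.gif' \) -print | sort)

echo "ungrouped:"; printf '%s\n' "$ungrouped"
echo "grouped:";   printf '%s\n' "$grouped"
rm -rf "$tmp"
```

The parentheses are escaped with backslashes so that the shell passes them to find instead of interpreting them itself.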
All times are GMT