Gentoo Forums
[Solved] sed help - converting html links
jasn
Guru
Joined: 05 May 2005
Posts: 412
Location: Maryland, US

Posted: Fri Jun 28, 2013 5:08 pm    Post subject: [Solved] sed help - converting html links

wget didn't seem to localize all the links in the html files after downloading a website for me. So I have some html files where I'd like to redirect the links to point to local versions of the files. So where there is the line;
Code:
href="http://www.yoyodyne.haha/index.html"

I would like to change it to;
Code:
href="index.html"

and where there is the following line;
Code:
src="http://www.yoyodyne.haha/Picture1.jpg"

I would like to change that to;
Code:
src="images/Picture1.jpg"

I basically don't know how to handle quotation marks, equals signs, and periods in the sed command, because I need to differentiate the;
Code:
src="

lines, from the
Code:
href="

lines. I tried different sed expressions and couldn't come up with anything that worked.

Any help would be appreciated.

Thanks..


Last edited by jasn on Fri Jun 28, 2013 7:29 pm; edited 1 time in total
666threesixes666
Veteran
Joined: 31 May 2011
Posts: 1223
Location: 42.68n 85.41w

Posted: Fri Jun 28, 2013 5:25 pm    Post subject:

There are regular expression examples for checking addresses like email@site.abc. Modify one of them to match http://*.abc; sed can then delete whatever matches the expression, which results in exactly what you want. I'd back up whatever you're going to wreck. Good luck.

ref
http://www.unix.com/shell-programming-scripting/135367-using-sed-validate-e-mail-address-entry.html

It should be something along the lines of "sed -i -e (regex you must craft) index.html", index.html being the file you want http://www.linkgarbage.com removed from.

http://www.cyberciti.biz/faq/howto-delete-word-using-sed-under-unix-linux-bsd-appleosx/
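
As a rough sketch of the delete-the-match idea described above (not a tested recipe for any real site): the function name strip_site_prefix is made up for illustration, and the hostname is the placeholder from the first post. It prints the result instead of editing anything, so the output can be checked before any file is touched.

```shell
# Sketch only: delete the site prefix wherever it appears on a line.
# The dots are escaped so they match literal periods, and "|" is used as
# the delimiter so the slashes in the URL need no escaping.
strip_site_prefix() {
    sed -e 's|http://www\.yoyodyne\.haha/||g'
}

# Preview on one sample line (nothing on disk is changed):
printf '%s\n' 'href="http://www.yoyodyne.haha/index.html"' | strip_site_prefix
# prints: href="index.html"

# To edit a file in place while keeping a backup copy (index.html.bak):
# sed -i.bak -e 's|http://www\.yoyodyne\.haha/||g' index.html
```

Note that deleting the prefix on its own would turn the image links into src="Picture1.jpg", without the images/ directory asked for in the first post, so the src links need a separate substitution.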
jasn
Guru
Joined: 05 May 2005
Posts: 412
Location: Maryland, US

Posted: Fri Jun 28, 2013 7:26 pm    Post subject:

Thanks for the tips and links, but that's similar to what I was reading earlier, and it didn't help me sort out how to handle the non-alphanumeric characters on the sed command line. However, your response did prompt me to do some more online searches for sed examples, like this one, and give it another try. I think I have a working sed command now;

For the href links;
Code:
sed -i s,href=\"http:\/\/www.yoyodyne.haha/\,href=\", *.html

and for the image src links;
Code:
sed -i s,src=\"http:\/\/www.yoyodyne.haha,src=\"images, *.html
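
For anyone finding this later, here is one possible variant of the two commands above, offered as a sketch rather than the definitive answer. It assumes the same placeholder hostname, and the function name localize_links is hypothetical. The differences are: the script is wrapped in single quotes so the double quotes need no backslashes, the dots are escaped so they match only a literal period, and the g flag is added so more than one link on the same line gets rewritten.

```shell
# Sketch: apply both rewrites in one pass, reading stdin and writing stdout.
# Single quotes stop the shell from interpreting the double quotes, and the
# "|" delimiter means the slashes in the URL can be written as-is.
localize_links() {
    sed -e 's|href="http://www\.yoyodyne\.haha/|href="|g' \
        -e 's|src="http://www\.yoyodyne\.haha/|src="images/|g'
}

# Preview on a sample line containing both kinds of link:
printf '%s\n' \
    '<a href="http://www.yoyodyne.haha/index.html"><img src="http://www.yoyodyne.haha/Picture1.jpg"></a>' \
    | localize_links
# prints: <a href="index.html"><img src="images/Picture1.jpg"></a>

# In-place form with backups (*.html.bak), once the preview looks right:
# sed -i.bak -e 's|href="http://www\.yoyodyne\.haha/|href="|g' \
#            -e 's|src="http://www\.yoyodyne\.haha/|src="images/|g' *.html
```

Equals signs need no special treatment in a sed pattern; only the period (and the chosen delimiter, if it occurs in the text) has to be escaped here.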


Anyone, please feel free to comment further, but I'll mark this solved.

Thanks again..