perl - Using curlmirror.pl gives different outputs -


Using http://curl.haxx.se/programs/curlmirror.txt, I want to download a website and check for changes between the newly downloaded copy and one I downloaded previously. Sometimes when I download the same website, the links on it use relative paths, and other times they use absolute paths. This counts as a "change" even though the website did not change.

Usage:

    curlmirror.pl -l -d 3 -o someoutputfiledirectory/url http://url

Output 1:

    <td><a href="testing.htm">link</a></td>

Output 2:

    <td><a href="http://mydomain.com/testing.htm">link</a></td>

Is there a way to convert relative paths to absolute paths, or the other way around? I need to standardize the download so that these links do not appear as "changes".

updated

I assume the URL is placed in the $url variable. You can try something like the following:

    perl -pe 'BEGIN {$url="http://mymain.org"} s!(\b(?:url|href)=")([^/]+)(")!$1$url/$2$3!gi' << xxx
    <td><a href="testing.htm">link</a></td>
    <td><a href="http://mydomain.com/testing.htm">link</a></td>
    <meta http-equiv="refresh" content="0;url="home">
    xxx

output:

    <td><a href="http://mymain.org/testing.htm">link</a></td>
    <td><a href="http://mydomain.com/testing.htm">link</a></td>
    <meta http-equiv="refresh" content="0;url="http://mymain.org/home">

It replaces href="..." or url="..." patterns (case-insensitive) with href="$url/..." or url="$url/...", but only if the ... part does not contain a / character.

If the input is a file, you can replace these patterns in the file directly:

    cat >tfile << xxx
    <td><a href="testing.htm">link</a></td>
    <td><a href="http://mydomain.com/testing.htm">link</a></td>
    <meta http-equiv="refresh" content="0;url="home">
    xxx

    cat tfile
    perl -i -pe 'BEGIN {$url="http://mymain.org"} s!(\b(?:url|href)=")([^/]+)(")!$1$url/$2$3!gi' tfile
    echo "---"
    cat tfile

output:

    <td><a href="testing.htm">link</a></td>
    <td><a href="http://mydomain.com/testing.htm">link</a></td>
    <meta http-equiv="refresh" content="0;url="home">
    ---
    <td><a href="http://mymain.org/testing.htm">link</a></td>
    <td><a href="http://mydomain.com/testing.htm">link</a></td>
    <meta http-equiv="refresh" content="0;url="http://mymain.org/home">
