
Re: broken links on linuxdoc.org



GNU wget will do this for you.  Here are the options:

 wget --spider --force-html -i Howto-Howto.html

From the man page:
        --spider
              When invoked with this option, Wget will behave as a
              Web "spider", which means that it will not download
              the pages, just check that they are there.
              You can use it to check your bookmarks...
    
        -F --force-html
              When input is read from a  file,  force  it  to  be
              HTML.  This  enables you to retrieve relative links
              from existing HTML files on  your  local  disk,  by
              adding  <base href="URL"> to HTML, or using --base.     

        -i filename --input-file=filename
              Read  URL-s  from  filename, in which case no URL-s
              need to be on the command line. If there are  URL-s
              both  on  the command line and in a filename, those
              on the command line are first to be retrieved.  The
              filename  need not be an HTML document (but no harm
              if it is) - it is enough  if  the  URL-s  are  just
              listed sequentially.

              However,  if you specify --force-html, the document
              will be regarded as HTML. In that case you may have
              problems  with  relative links, which you can solve
              either by adding <base href="url"> to the  document
              or by specifying --base=url on the command-line. 
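
If the HTML file only has relative links, the man page suggests --base; a
quick sketch (the base URL below is just a placeholder, use whatever the
real document root is):

 wget --spider --force-html --base=http://www.linuxdoc.org/ -i Howto-Howto.html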

I guess as long as the links are absolute, it should be just fine :-)
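
To actually see which links are dead, send wget's output to a log file and
grep it afterwards. A rough sketch (the log name is arbitrary, and the grep
pattern depends on what your wget version prints for a failed request):

 wget --spider --force-html -i Howto-Howto.html -o spider.log
 grep -B 2 '404 Not Found' spider.log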

Jesse 


On Thu, 02 Nov 2000, David Merrill wrote:
> Greg Ferguson wrote:
> > 
> > On Nov 1, 11:05pm, Gerald Oskoboiny wrote:
> > > Subject: broken links on linuxdoc.org
> > 
> > This has been corrected. If anyone finds or knows of other broken
> > links (or potential broken links) such as this, let us know so
> > we can put the redirects in place.
> 
> It wouldn't be too hard to write a script to verify URLs in the HTML
> versions posted online, right? I know I wrote such a script for my
> employer last year. Unfortunately, I no longer have it. :(
> 
> This would catch these types of errors before they become a problem for
> users.
> 
> -- 
> David C. Merrill, Ph.D.
> Linux Documentation Project
> Collection Editor & Coordinator
> www.LinuxDoc.org
> 
-- 
Got freedom?
http://www.debian.org


--  
To UNSUBSCRIBE, email to ldp-discuss-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org