This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: Wget ignores robot.txt entry


> Max,
>
> No, I don't think cURL does recursive retrieval. I don't think it does
> Web page dependency retrieval, either. Both of these are a big deal for
> me. How could a tool of wget's versatility be replaced by something
> inferior? Whatever happened to technological meritocracy? (Please, no
> laughing.)
>
> I was actually hoping to get some time to work on an extension to wget
> of my own. I wanted to add an option that would cause wget to look in
> one hierarchy to determine file existence and modification times
> relative to the set of files and mod times on the server and download
> new or newer files to a different location. That way I can easily
> maintain mirror copies on a CD-ROM. I'd tell wget to use the CD's
> contents as the file and mod-time reference and to download to a
> location on my hard drive (of course). Then I could incrementally
> update the ROM with whatever was downloaded.

That's a real good idea! :-)
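
For what it's worth, the selection step could be sketched roughly like this. It is only an illustration of the idea, not an existing wget option: the function names and the remote_mtimes mapping (relative path to server modification time, in seconds since the epoch) are made up for the example, and it assumes something else has already fetched the server's file list.

```python
import os
from typing import Dict, List


def select_updates(reference_root: str, remote_mtimes: Dict[str, float]) -> List[str]:
    """Return relative paths that are new, or newer on the server, compared
    with the copies under reference_root (e.g. the mounted CD-ROM).

    remote_mtimes is a hypothetical mapping of '/'-separated relative paths
    to server modification times; building it is outside this sketch.
    """
    needed = []
    for rel_path, remote_mtime in remote_mtimes.items():
        local_path = os.path.join(reference_root, *rel_path.split("/"))
        try:
            local_mtime = os.path.getmtime(local_path)
        except OSError:
            # Not present in the reference hierarchy: treat as new.
            needed.append(rel_path)
            continue
        if remote_mtime > local_mtime:
            # The server copy is newer than the reference copy.
            needed.append(rel_path)
    return sorted(needed)


def staging_path(download_root: str, rel_path: str) -> str:
    """Map a relative path to its location in the separate download area."""
    return os.path.join(download_root, *rel_path.split("/"))
```

The reference tree is only ever read, so it can live on read-only media; everything selected would be written under the download root and burned to the disc later.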

> Of course I can still do that and I may yet. Does that sound like a
> desirable feature to anyone? I don't know how many people share my
> mania for keeping local archives of content from the Internet.

I seem to end up doing this quite a lot when on the hunt for new concepts and
ways of doing things. A huge web suck, most of which I never so much as glance
at, and I've got a whole archive of stuff where I can just grep out the crap.

> What happens to an open source project when it devolves to this state?
> Who, for example, could hand out writable access to the wget CVS
> repository? Surely this isn't an unrecoverable state of affairs, is it?
>
> Randall Schulz

Wasn't a patch applied to CVS HEAD of the wget repository only a few weeks ago?
That's what it looks like, anyway.


Regards,

Elfyn McBratney
elfyn@exposure.org.uk
www.exposure.org.uk




