Selective Site Mirroring with wget

Since Cloudera doesn’t seem to offer an rsync server to mirror against, I had to resort to wget to mirror their CDH distribution. To save some time for anyone attempting something similar, here’s a wget one-liner that grabs only the RPMs (plus any .xml files, per the -A list) while preserving the directory structure:


wget -N -r -nH -np --cut-dirs=3 -A rpm,xml http://archive.cloudera.com/redhat/cdh/3/
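For anyone who finds the short flags hard to read, here is a rough sketch of the same command spelled out with wget's long options and wrapped in a small bash function. The function name mirror_cdh and the DRY_RUN switch are my own illustrative additions, not part of the original one-liner; the URL and flag values are taken straight from it.

```shell
#!/usr/bin/env bash
# Sketch: the same wget invocation, spelled out with long options.
# mirror_cdh and DRY_RUN are illustrative names (assumptions), not part of the original one-liner.

mirror_cdh() {
  # Base URL from the one-liner; pass a different one as the first argument.
  local url="${1:-http://archive.cloudera.com/redhat/cdh/3/}"

  local cmd=(
    wget
    --timestamping          # -N: skip files that are not newer than the local copy
    --recursive             # -r: follow links below the start URL
    --no-host-directories   # -nH: do not create an archive.cloudera.com/ directory
    --no-parent             # -np: never ascend above the start directory
    --cut-dirs=3            # drop the leading redhat/cdh/3/ path components
    --accept rpm,xml        # -A: keep only files ending in rpm or xml
    "$url"
  )

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

After sourcing the file, running DRY_RUN=1 mirror_cdh prints the assembled command without touching the network, and plain mirror_cdh runs it in the current directory. Note that --cut-dirs=3 matches the three path components in this particular URL, so it would need adjusting for a different starting path.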

11. May 2011 by Jason
Categories: UNIX

Comments (2)

  1. You may also want to look into reposync, part of the package yum-utils:
    reposync -c http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo --source -r cloudera-cdh3

    • That would have worked as well. I’ve had problems with reposync in the past, so I completely discounted it, but it would certainly do the trick in this situation!
