Top Posts Tagged with #crawlingtest

How to do a crawling test using wget

At work, before deploying any major version of our site, we run a crawling test to check that all links work as expected. Some applications make this easy, such as KLinkStatus, but in this post I will explain how to check all the links on a website using only wget. It is as simple as running the following command:

wget -r -S http://yoursite.com 2>&1 | tee /tmp/crawlingTest

The -r option turns on recursive retrieval and -S prints the headers sent by the HTTP server. Because wget writes its log to stderr, 2>&1 redirects it into the pipe so that tee can both display it and save it to the file "/tmp/crawlingTest". You can then filter that file in real time to spot Internal Server Errors (500), Not Found (404), Gone (410) or any other status you want. E.g.:

tail -f /tmp/crawlingTest | grep "HTTP/1\.[01] 500" # see Internal Server Errors (500)
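
Once the crawl has finished, the same log can be summarized instead of watched. The following is a minimal sketch, not something wget provides: the function names status_summary and broken_links are made up for illustration, and both assume the default log layout that wget produces with -S, where each request starts with a line like "--2024-01-01 12:00:00--  URL" and each response status appears as an indented "HTTP/1.1 404 Not Found" line. If your wget version formats its log differently, the patterns will need adjusting.

```shell
#!/bin/sh
# Hypothetical helpers (not part of wget) for summarizing the crawl log.
# Assumed log layout: request lines start with "--<date> <time>--  <url>",
# and status lines look like "  HTTP/1.1 404 Not Found".

# Count how many responses were seen for each status code.
status_summary() {
    awk '/^ *HTTP\/[0-9.]+ [0-9][0-9][0-9]/ { count[$2]++ }
         END { for (code in count) print code, count[code] }' "$1" | sort
}

# Print "STATUS URL" for every response in the 4xx or 5xx range.
# The URL is taken from the most recent request line seen before the status.
broken_links() {
    awk '/^--[0-9]/ { url = $NF }
         /^ *HTTP\/[0-9.]+ [45][0-9][0-9]/ { print $2, url }' "$1"
}
```

After sourcing these functions, running status_summary /tmp/crawlingTest gives a quick overview of the crawl, and broken_links /tmp/crawlingTest lists the URLs that need attention. When a request is redirected, every intermediate response is counted too, since each one has its own status line in the log.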

But what if you need authentication? Recursive wget also works on pages protected by HTTP basic authentication if you add the --user and --password options. E.g.:

wget -r -S --user=johndoe --password=mypassword yoursite.com 2>&1 | tee /tmp/crawlingTest

Why use wget instead of a dedicated link checker? With wget you don't need to download, install and configure any new software; it is really simple to use; you can run it straight from the command line, so you don't need a window system; and it consumes fewer resources than most applications designed for the job.

For more information, take a look at the GNU Wget Manual and this thread on Stack Overflow.

#CrawlingTest #Linkchecker #SEO #webdev
