How to clone any website in the world and host it on a domain of your choice for Dummies
First things first: even though the title is clickbaity, the intention isn’t to help you plagiarise content. Having said that, there are some valid use cases for this:
Disaster recovery - your actual website is getting DDoSed and you need a static version up and running temporarily, hosted elsewhere (preferably behind CloudFront, which provides DDoS mitigation through AWS Shield)
Backups of your blogs or content hosted on third-party websites
Migrating away from third-party providers. For example, I have been thinking of migrating away from Tumblr, and this can be useful if I plan to use a static site generator for my blog going forward
This works well for a static website and below are the steps to achieve it! I’ll show you how to clone this blog and host it on this subdomain: http://badd431a8703451fa409d73d2218b545.aawaara.com/
Hopefully this will not result in every other person having the same blog as mine ;)
STEP 1: Install wget and AWS CLI and set up your AWS credentials
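A quick way to confirm both tools are in place before going further is sketched below. The helper name have_tool is made up for this example; it only checks that the commands are on your PATH and does not validate your AWS credentials.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in wget aws; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing, install it before continuing"
    fi
done

# Once the AWS CLI is installed, store your credentials interactively with:
#   aws configure
```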
STEP 2: Create a mirror of the website and re-write links
wget --mirror --convert-links --html-extension --wait=1 https://aawaara.com/
--wait=1 makes wget pause for 1 second between requests. Be gentle with someone else's server; you also risk being blocked if your request rate is too high. If the server is blocking your requests, use the -U option to set the user agent to that of an actual browser. Depending on the size of the site, this can take hours, so be patient.
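Here is a rough sketch of the same mirror command with a browser-like user agent added through -U. The user agent string is only an illustrative value, mirror_site is a made-up function name, and --random-wait is an extra wget option (not part of the command above) that varies the delay between requests. By default the script only prints the command; set RUN=1 to actually start the download.

```shell
#!/bin/sh
# Illustrative browser-like user agent string; substitute your own.
USER_AGENT="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"

# Hypothetical wrapper: prints the wget command unless RUN=1 is set.
mirror_site() {
    url=$1
    # Same flags as the command above, plus --random-wait and -U.
    set -- wget --mirror --convert-links --html-extension \
        --wait=1 --random-wait -U "$USER_AGENT" "$url"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "Would run: $*"
    fi
}

mirror_site "https://aawaara.com/"
```

Printing first makes it easy to double-check the URL and flags before committing to a long crawl.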
STEP 3: Create S3 bucket and enable "Static Website Hosting"
Create an S3 bucket with the same name as the domain (or subdomain) you want to host it on. In my case, I need to create badd431a8703451fa409d73d2218b545.aawaara.com and allow uploading of public files (it’s a setting while creating the bucket). Then enable static website hosting for your bucket and specify index.html as the index document.
Also note the endpoint here as you will need it later.
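If you prefer the command line over the console, the same setup could look roughly like the sketch below. This is an assumption on my part rather than the steps I actually clicked through: the region is taken to be us-east-1, and the two s3api calls are only needed on buckets where public ACLs are blocked by default. The run and setup_bucket names are made up for this example, and nothing is executed unless you set RUN=1.

```shell
#!/bin/sh
BUCKET="badd431a8703451fa409d73d2218b545.aawaara.com"
REGION="us-east-1"   # assumed region; use the one you want the bucket in

# Hypothetical helper: print each command by default, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

setup_bucket() {
    # Create the bucket, named after the (sub)domain it will serve.
    run aws s3 mb "s3://$BUCKET" --region "$REGION"
    # Allow public ACLs on the bucket (may be blocked by default).
    run aws s3api put-public-access-block --bucket "$BUCKET" \
        --public-access-block-configuration \
        "BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false"
    run aws s3api put-bucket-ownership-controls --bucket "$BUCKET" \
        --ownership-controls "Rules=[{ObjectOwnership=BucketOwnerPreferred}]"
    # Enable static website hosting with index.html as the index document.
    run aws s3 website "s3://$BUCKET/" --index-document index.html
}

setup_bucket
```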
STEP 4: Upload files to S3 bucket using AWS CLI and make them public
Go to the directory where the files have been downloaded (cd aawaara.com) and execute the following command (for your bucket):
aws s3 sync . s3://badd431a8703451fa409d73d2218b545.aawaara.com/ --acl public-read
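Before uploading, it can be worth checking that you are in the right directory, since the bucket expects an index.html at the top level. The sketch below does that with a made-up check_mirror helper and then prints a preview command; --dryrun is an aws s3 sync option that lists what would be uploaded without uploading anything.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if DIR has a top-level index.html.
check_mirror() {
    dir=$1
    if [ ! -f "$dir/index.html" ]; then
        echo "No index.html in $dir; is this the mirrored directory?" >&2
        return 1
    fi
    # Count the files that would be considered for upload.
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "Found $count files under $dir"
}

if check_mirror .; then
    echo "Preview the upload with:"
    echo "  aws s3 sync . s3://badd431a8703451fa409d73d2218b545.aawaara.com/ --acl public-read --dryrun"
fi
```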
STEP 5: Add a CNAME entry in your DNS records which points to your S3 bucket
CNAME record for badd431a8703451fa409d73d2218b545.aawaara.com should point to badd431a8703451fa409d73d2218b545.aawaara.com.s3-website-us-east-1.amazonaws.com
Please note that the value of the CNAME record is the endpoint you noted down at the end of step 3. In this case the CNAME record is for a subdomain. A root domain such as example.com is different: DNS does not allow a CNAME record at the zone apex, so for a bucket named example.com you would need an alias-style record from your DNS provider (for example, a Route 53 alias record) that targets the S3 website endpoint for example.com. You can find more details here.
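To double-check the record value, here is a small sketch that builds the endpoint from the bucket name and region. It assumes the dash-style s3-website-REGION host name that us-east-1 uses in this post; other regions may use a different format, so treat the endpoint shown in the S3 console as the source of truth. website_endpoint is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: build the S3 website endpoint for a bucket, assuming
# the dash-style "s3-website-REGION" host name used by us-east-1.
website_endpoint() {
    printf '%s.s3-website-%s.amazonaws.com\n' "$1" "$2"
}

DOMAIN="badd431a8703451fa409d73d2218b545.aawaara.com"
echo "CNAME target: $(website_endpoint "$DOMAIN" us-east-1)"

# After the record propagates, these should show the target and an HTTP response:
#   dig +short CNAME "$DOMAIN"
#   curl -I "http://$DOMAIN/"
```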
That's it folks! Your clone should now be accessible on the domain of your choice!
Reference: Gist
















