365 days of code @365xcode - Tumblr Blog

Jan 24: Scrapy - effectively

Using Scrapy effectively mostly boils down to reading this tutorial: http://doc.scrapy.org/en/latest/intro/tutorial.html. The gist of it -

- Directory / project file structure

- Scrapy configuration file

- Spider file

Once you have these bare minimums ready, the XPath fun begins. Scrapy lets you specify XPath selectors to pick out exactly the parts of the page you want. Some that I found useful -

//text() - selects all text nodes, including those inside child elements, whereas /text() selects only the text nodes directly under the element, without its children.

.xpath('..') - selects the parent of the found node. @href - the link target of an anchor. .xpath("//div[contains(@class,'jaja')]") - all divs whose class contains jaja.

So you got the drift on that one.
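To see what these selectors pick without installing anything, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree instead of Scrapy itself. ElementTree only supports a subset of XPath (no text() and no contains()), so the two text selections are emulated by hand. The sample markup and its class names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page (ElementTree needs valid XML).
PAGE = """<html><body>
<div class="jaja"><p>Hello <b>bold</b> world</p><a href="/next">next</a></div>
<div class="other"><p>skip me</p></div>
</body></html>"""

root = ET.fromstring(PAGE)
p = root.find(".//div[@class='jaja']/p")

# Like p/text(): only the text nodes sitting directly under <p>.
direct_text = [p.text] + [child.tail for child in p]

# Like p//text(): every text node under <p>, children included.
all_text = list(p.itertext())

# Like //a/@href: the link target of the anchor.
href = root.find(".//a").get("href")

# Like '..': step from the anchor up to its parent element.
parent = root.find(".//a/..")

print(direct_text)
print(all_text)
print(href, parent.tag, parent.get("class"))
```

In a real spider the same ideas would be written as XPath strings passed to Scrapy's selectors; the point here is only which nodes each expression reaches.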

A handy way to try all this out is the Scrapy shell. Run scrapy shell 'URL' to launch it and call shelp() to see the available objects and shortcuts. You can experiment with all your selectors there.

Happy scraping

#scrape #scrapy #python #crawler

Jan 12: Heroku reference

So you want to use Heroku. Good. I use it for a few purposes, mostly for a few hours of testing or the like.

If you're like me and want to manage multiple accounts, use the awesome heroku-accounts plugin: https://github.com/ddollar/heroku-accounts.

To set up a new account, simply run heroku accounts:add <name> --auto. It sets up the account, generates a keypair, lists it in the gitconfig, and you're ready to roll.

Then comes the management. It's fairly easy to push a branch to Heroku - git push heroku master - pushes your master branch to Heroku. In case you want to push something else - git push heroku <branch>:master works perfectly.

These tidbits posted here to avoid future fuckups.

#heroku #git

jan 11: Blacklist that bad bad driver

So there is this driver that has been troubling you? Blacklist it -

echo "blacklist drivername" | sudo tee -a /etc/modprobe.d/blacklist-broadcom-wireless.conf

sudo update-initramfs -u

Jan 14: So i missed the AWS party !

I may be a late comer but better late than never.

So while configuring the deployment of one of my 'clouds' I wanted to make an AMI, and I didn't know what it was (though I knew what the acronym stood for :D).

An AMI is an Amazon Machine Image. For an EBS-backed instance, creating an image takes a snapshot of the EBS volume and records the launch details alongside it; the instance type (micro, small etc.) is picked when you launch from the image. The AMI is then registered and an ID is provided.

This ID can be punched into Elastic Beanstalk, so that when scaling triggers, the same AMI is used to create the new instances.

#aws #elasticbeanstalk #AMI

Jan13: restarting httpd and not losing env variables

Every time you restart httpd directly, the process loses its env vars. I found this out in the middle of a deployment on the Elastic Beanstalk service of AWS.

Use apachectl! It restarts httpd, takes care of the httpd PID, and also resets the env vars for the process.

Some reference (though it doesn't mention apachectl): http://drumcoder.co.uk/blog/2010/nov/12/apache-environment-variables-and-mod_wsgi/

#httpd #aws #beanstalk

Jan 12: more wifi troubles

Lubuntu and wifi drivers are giving me a tough time. Having a netbook HP mini 110 run on a lightweight linux distro is quite a challenging (and fun) task.

Back to wifi. Yeah, so there seem to be many drivers for this chipset Broadcom BCM4312. It turns out that B43 works best.

So you need to unload the wl module along with the other candidate drivers, then load b43 -

sudo modprobe -r b43 ssb wl brcmfmac brcmsmac bcma

sudo rmmod -f wl

sudo modprobe b43

This should kick in the correct driver.

references: http://ubuntuforums.org/showthread.php?t=1859446

#lubuntu #drivers #linux

Jan 11: MX record woes

So I had happily set up an A record to point mail.<domain>.com to the Bluehost servers, and outbound mail was going out fine.

But wait.. incoming mail was not being received - the reason being that the MX records were not set properly. :/

So, lesson learnt: when another server tries to send email to your domain, it looks up the MX records first and acts on what it finds. It's not that I didn't know this before. I did. The problem was that I 'thought' I had already done it and was ignoring the fact that I hadn't.

And I wasn't able to trace properly where the mails were going and how.

There is this tool - www.mtgsy.net - that checks the email trace for you. Helpful to debug. :)
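The lookup order behind this lesson can be sketched in a few lines of Python. This is a simplified model of what a sending server does (loosely following RFC 5321), not a real DNS client: the function name, the example.com zone and the IP addresses are all hypothetical, and the records are plain dictionaries instead of DNS answers.

```python
def delivery_targets(domain, mx_records, a_records):
    """Hosts a sending server would try for user@domain, in order.

    Simplified from RFC 5321: use the MX records sorted by preference
    (lowest first); only if there are none, fall back to the domain's
    own address record.
    """
    mx = sorted(mx_records.get(domain, []))  # list of (preference, host)
    if mx:
        return [host for _, host in mx]
    return [domain] if domain in a_records else []


# Hypothetical zone: the web app owns the bare domain, mail lives elsewhere.
a_records = {"example.com": "203.0.113.10", "mail.example.com": "198.51.100.20"}

# Without an MX record, mail is sent to the web app host, not the mail host.
before = delivery_targets("example.com", {}, a_records)

# With the MX record in place, mail is sent to mail.example.com.
mx_records = {"example.com": [(10, "mail.example.com")]}
after = delivery_targets("example.com", mx_records, a_records)

print(before)  # ['example.com']
print(after)   # ['mail.example.com']
```

So an A record for the mail subdomain is not enough on its own: nothing tells other servers to deliver there until the MX record names it.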

#mx-records #bluehost #email

jan 10: Conflicting drivers in Linux distros

While I was configuring wifi on my small Atom-powered netbook, which now runs Lubuntu very well, I picked up a handy tidbit that I want to share - conflicting drivers and how to tackle them.

If you have been experimenting with various drivers for one of your devices and are lost as to how to test them one by one, you can unload them all with -

sudo modprobe -r 'driver1' 'driver2' ...

Then you can enable them individually with sudo modprobe 'driver1' and check what the heck is going on. I'm new to drivers, and there is a bit of configuration involved. :-/

#lubuntu #drivers #linux

jan 9: wireless on lubuntu + hp mini 110

This took some time and I came out with a better understanding of device drivers.

Some tidbits -

lspci, rfkill list all, lsusb, modprobe (-r) and a few more commands helped me triage what was happening. In general, when you want to install a driver, first identify the device: lspci -nn prints the hardware ID.

Armed with this hardware ID, you can search for working device drivers. For me the wifi was wreaking havoc; it is a Broadcom 4312 chipset and my Lubuntu was 12.04. Steps I followed -

1. Remove the existing driver - sudo apt-get remove bcmwl-kernel-source

2. Install this - sudo apt-get install firmware-b43-lpphy-installer b43-fwcutter (the LP-PHY firmware for low-power wireless chipsets such as the BCM4312)

3. Check if the bcm43xx drivers are blacklisted - cat /etc/modprobe.d/* | egrep 'bcm'

4. If they are, comment out the line that says 'blacklist bcmxx' in the following file - sudo vim /etc/modprobe.d/blacklist.conf

5. Reboot - and you should be up and running with smooth wifi.

#bcm4312 #broadcom #wifi #lubuntu #hp-mini

jan 8: Bootable SD card for your linux flavor

So I had an old HP Mini from which I wanted to flush out Windows completely and install a nice Linux distro. Atom processor, 2 gigs of RAM = LUBUNTU !

To install -

1. Use Unetbootin/ Universal USB installer to get your system on the card.

2. Try 1000 times to make it boot.

3. Try to write the MBR to the first sector in the card.

4. Try 1000 times again.

5. I don't know how it finally worked, but I used this utility - mobaliveusb.exe - to check whether the card is really bootable.

6. I plug the same god damn card in - and it boots.

#TODO : Did the utility actually change anything, or what happened ? I'm in no mood to find out.

#sd card #bootable #lubuntu

jan 7: Configure email and hosting separately

I host my web application on one host and have my e-mail accounts configured somewhere else. How did I make it work with my mail host, Bluehost ?

1. Point the A record of the domain to the specific Bluehost account IP. As per their support, the IP is not supposed to change unless the plan is changed or an upgrade occurs.

2. Assign the domain as a secondary domain on Bluehost.

3. Once the domain is assigned, just point the mail.domain.com A record to that Bluehost IP (so that your @ record can still point to the web app, serving the application from there).

4. Done ! Now the A record of your mail.domain.com points to Bluehost whilst all other subdomains and the naked domain point to the web app (if you have configured it that way).

5. The incoming and outgoing mail servers are - mail.bluehost.com (SMTP, port 26) and mail.bluehost.com (IMAP and POP). Enjoy.

#bluehost #email #hosting

Jan 6: Two small linux commands ;)

Small but helpful -

du -hs /path/to/folder - the total size of a folder, in human-readable units

uname -a - tells you the kernel name, release, version and architecture of your Linux box

#linux #basic

Jan 5: Deployment, well done.

Startups have to ship fast. Real fast.

This means a solid deployment strategy is a must, so devs can test real quick and never fall back on that 'it works on my system' excuse.

Sidenote: I signed up for Business support on AWS to get through Elastic Beanstalk deployments quickly. Noticed 2 things - they mostly don't strictly honor the 1 hour turnaround ETA, and the live chat takes too long to connect to a representative. Anyways, I digress.

So if you are using AWS, go ahead and use Elastic Beanstalk. Use .ebextensions/*.config files to write deployment instructions under two different keys -

- commands - run before the app is extracted on the instance

- container_commands - run after the app is extracted but before it's deployed

If you want a platform agnostic deployment script you can have a look at Fabric.

#deployment

Jan 4: Logging ! It's a must

As far as I have understood after reading through various Python and Django logging articles, here is how those things work.

3 things -

1. Logger - this is the one that catches the events to log.

2. Filter - this can be used to segregate and categorise the logging events.

3. Handler - these guys get what's left after filtering - the actual events - and decide where they end up.

Let's get down to how they work -

Loggers -

In loggers you say which 'handlers' to associate with a specific logger. There are 3 loggers in Django v1.4 - django (catch all), django.request (request handling), django.db.backends (DB queries).

Handlers -

In addition to the core Python logging handlers - StreamHandler (sys.stdout, sys.stderr..), FileHandler, NullHandler, WatchedFileHandler and so on - Django has one more class - AdminEmailHandler

Filters -

Most commonly used - django.utils.log.RequireDebugFalse. It only lets events pass when settings.DEBUG is False.

Then it's a simple matter of writing a dictionary (the LOGGING setting) and letting things take care of themselves while you sip coffee.
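Here is a rough sketch of such a dictionary, wired up with plain Python so it runs without Django. It feeds the dictionary to logging.config.dictConfig from the standard library. Everything named here is an assumption for illustration: the DEBUG flag stands in for settings.DEBUG, the RequireDebugFalse class is a hand-written stand-in for Django's filter of the same name, and the 'myapp' logger and in-memory stream are hypothetical.

```python
import io
import logging
import logging.config

DEBUG = False  # stand-in for settings.DEBUG


class RequireDebugFalse(logging.Filter):
    """Hypothetical stand-in for django.utils.log.RequireDebugFalse."""

    def filter(self, record):
        # Let the event through only when DEBUG is off.
        return not DEBUG


buffer = io.StringIO()  # in-memory target so the output is easy to inspect

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "filters": {
        "require_debug_false": {"()": RequireDebugFalse},
    },
    "handlers": {
        "memory": {
            "class": "logging.StreamHandler",
            "level": "WARNING",
            "filters": ["require_debug_false"],
            "stream": buffer,
        },
    },
    "loggers": {
        # The logger names which handlers receive its events.
        "myapp": {"handlers": ["memory"], "level": "INFO", "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)

log = logging.getLogger("myapp")
log.info("below the handler level, so it is dropped")
log.error("payment failed")

print(buffer.getvalue())
```

The event flows logger -> filter -> handler: the INFO message passes the logger but is dropped by the handler's WARNING level, while the ERROR message passes the filter (DEBUG is False) and lands in the stream.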

References -

http://docs.python.org/2/library/logging.handlers.html

https://docs.djangoproject.com/en/1.5/topics/logging/

www.miximum.fr/bien-developper/876-an-effective-logging-strategy-with-django

#logging #django

Jan 3, 2014: Backup revisited

So it seems there was a problem with the backup method I posted yesterday.

Suppose you have 3 servers - S3, S2 and S1. You did well on S3 and S2, and now it's time to do what I told you on S1. But what do you see after installation ?

Dropbox's two-way sync is a bummer here, and the CLI client doesn't offer selective sync out of the box. So on S1 you also get the S3 and S2 files. Bandwidth - wasted. Disk - wasted.

So what to do ?

Write a small shell script ;) -

#!/bin/bash

cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -

~/.dropbox-dist/dropboxd &

wget -O dropbox.py "https://www.dropbox.com/download?dl=packages/dropbox.py"

chmod +x dropbox.py

./dropbox.py exclude add ~/Dropbox/S2 ~/Dropbox/S3

Now Dropbox will behave and exclude the S2 and S3 directories in your Dropbox folder. Solved.

But a better approach would be to go with cron + s3cmd, maybe in the future - it's not as quick and dirty, but it is a more elegant solution.

#backup dropbox

Jan 2, 2014: Backup hack for servers

So you are on your server and you want to backup that important code directory. And do it FAST.

In pops rsync (if you have another machine to sync with). But no other machines are handy :P (oh no!)

How about exploiting Dropbox ! ;)

Step 1: Install Dropbox.

32-bit: cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86" | tar xzf -

64-bit: cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -

Next, run the Dropbox daemon from the newly created .dropbox-dist folder and configure it -

~/.dropbox-dist/dropboxd

Hopefully you have Dropbox up and running now !

Step 2: Create a symlink to the folder.

ln -s /path/to/folder/that/you/want/to/sync/ ~/Dropbox/folder/name

TADA ! Backup on its way :)

1 Jan 2014: Footer year !

So it's time to change the footer year, eh ? Would you do it yourself or...

Why not use a one-liner (with jQuery) to do it for you ? Here it is !

    var date = new Date().getFullYear(); $("#year").html(date);

Plus I profiled the JavaScript code I wrote - it takes 0.0072 seconds to execute that line of code. Here is the code I wrote to profile it (it is highly unoptimized too) -

    var start = new Date().getTime();
    for (var i = 0; i < 5000; ++i) {
        var date = new Date().getFullYear();
        $("#year").html(date);
    }
    var end = new Date().getTime();
    var time = end - start; // elapsed milliseconds for all 5000 runs
    alert('Execution time: ' + time);

#hackery #javascript
