Google Search Scraping With Python

Python lets you do great things with very little code, and it has a great set of powerful libraries and packages. I hope to illustrate this by demonstrating how you can scrape the results of a Google search with a very simple and short Python script. Older versions of such scripts depended on the Google AJAX Search API, which no longer works; this is an alternative approach.

This piece of code is built around two modules, ‘urllib’ and ‘requests’. The ‘get’ function of the ‘requests’ module lets you access the specified URL, and the ‘urllib’ module lets you read the URLs on the page and store or output them.
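As a rough illustration of how the two modules divide the work, here is a minimal sketch rather than the original script. The search URL, the helper names, and the `/url?q=...` redirect format of result links are all assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed base URL for the search page; not taken from the original script.
SEARCH_URL = "https://www.google.com/search"


def build_search_url(query):
    """Return the search URL for a query, with the query safely encoded."""
    return SEARCH_URL + "?" + urlencode({"q": query})


def unwrap_result_link(href):
    """Return the target URL hidden in a '/url?q=...' redirect link.

    Links that do not follow that (assumed) pattern are returned unchanged.
    """
    parsed = urlparse(href)
    if parsed.path == "/url":
        targets = parse_qs(parsed.query).get("q")
        if targets:
            return targets[0]
    return href


def fetch_results_page(query):
    """Download the results page. Requires the third-party 'requests' package."""
    import requests  # imported here so the URL helpers work without it

    response = requests.get(build_search_url(query), timeout=10)
    response.raise_for_status()
    return response.text
```

Calling `fetch_results_page("python scraping")` would return the raw HTML of the results page, and `unwrap_result_link` can then be applied to each link pulled out of it.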

For this code to work, you will also need the lxml library and the cssselect Python package. These are needed to process the formatting of the results page. lxml is widely used in Python scripts, but it is a third-party library, so it has to be installed if your environment does not already include it. You can download the package and read its documentation here:

As for cssselect, you might get this error if the package is not installed on your system:

To fix this, download the cssselect package, which you can do from here:

To install this package, run this command from the directory where the downloaded .whl file is located:

After doing so, you can run the script and/or use it in your own programs to scrape Google search results. Have fun!

Rules of Language

It is said that rules are meant to be broken when it comes to language, yet there exist Grammar Nazis.
Experts frequently advise keeping it simple, yet simplicity is associated with amateurity.

Language is meant to be a means of communication. The advice that it need not follow rules is often given but rarely appreciated. Sometimes people feel annoyed at the improper use of a comma or underwhelmed by the simplicity of the language used. William Shakespeare did not invent half the words we use today by simply following the rules, and the fact that this generation of laureates refuses to break the rules can affect the future of language.

But is that true? I mean, the internet community sure does not follow any rules; it creates trends once in a while and flips them around again when it feels like it. Change is inevitable in the internet community. Society as a whole is deeply influenced by the language it uses, and although English is now widely used, other languages are not easily susceptible to change.

Rules and regulations are required for keeping chaos out, but they should not prevent you from using a comma to add a pause when you feel like it.

P. S.  I don’t know if there is a word called “amateurity”, which I used in the first paragraph; maybe it exists, maybe it does not. The thing is, nobody stopped me from using it!

Language has no rules.