Thursday, August 04, 2011

Tuesday, May 12, 2009

Java Heap Analysis Using Jhat after OOM

OutOfMemoryError in Java: sometimes the garbage collector can't collect everything and the JVM cries ..

Use jmap to take a heap dump and jhat to analyze it.

http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/memleaks.html

Check here how to query the heap

If your heap dump is 2GB or larger, you may not be able to run the jhat server on a machine with limited memory. In that case, try using an Amazon EC2 instance.

Monday, December 29, 2008

zimbabao@Twitter

Follow me @ Twitter

Wednesday, November 12, 2008

Stopping and starting a process on Unix/Linux

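# Stop the process by its process ID (equivalent of Ctrl-Z)
kill -s SIGSTOP processid

# Start the process again
kill -s SIGCONT processid

# Get the list of background jobs
jobs

# Bring a job to the foreground (job IDs are listed in the output of jobs)
fg %1

With no argument, fg acts on the current job, which is the same as fg %1 when job 1 is the only job.

# Resume a stopped job in the background
bg %1

Likewise, bg with no argument acts on the current job.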
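
The same stop and continue signals can be sent from a program. Below is a minimal Python sketch, assuming a POSIX system and a process that is a child of the script (waitpid only reports on children). The function names and the sleeping child are made up for illustration.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume(pid):
    """Stop a child process with SIGSTOP, then resume it with SIGCONT.

    Returns (stopped, resumed) as observed through waitpid.
    """
    os.kill(pid, signal.SIGSTOP)                # like: kill -s SIGSTOP pid
    _, status = os.waitpid(pid, os.WUNTRACED)   # returns once the child is stopped
    stopped = os.WIFSTOPPED(status)

    os.kill(pid, signal.SIGCONT)                # like: kill -s SIGCONT pid
    _, status = os.waitpid(pid, os.WCONTINUED)  # returns once the child runs again
    resumed = os.WIFCONTINUED(status)
    return stopped, resumed


def demo():
    # Hypothetical long-running child: a Python process that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        return pause_and_resume(child.pid)
    finally:
        child.kill()   # clean up the child whatever happened above
        child.wait()
```

Calling demo() should report that the child was seen stopped and then seen running again. For a process that is not a child of the script, only the two os.kill calls apply.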

Wednesday, October 01, 2008

Sar utility on Linux/Unix

At Yahoo! I used a tool called Ysar to get historical system resource performance data.

Elsewhere the same utility is called sar.
For more about this utility check this link

Installing it using Yum

root#yum install sysstat

Then initialize sar by running

root#/usr/sbin/sa

Wait for some time (about 20 minutes) and check that everything is set up correctly.
root#sar -A

You will see output like


08:40:01 AM proc/s
08:50:01 AM 0.17
Average: 0.17

08:40:01 AM cswch/s
08:50:01 AM 3364.52
Average: 3364.52

08:40:01 AM CPU %user %nice %system %iowait %steal %idle
08:50:01 AM all 4.66 0.00 1.55 12.26 0.00 81.53
08:50:01 AM 0 0.22 0.00 0.22 0.21 0.00 99.35
08:50:01 AM 1 0.10 0.00 0.07 0.22 0.00 99.61
08:50:01 AM 2 3.86 0.00 1.23 37.90 0.00 57.01
08:50:01 AM 3 14.44 0.00 4.69 10.71 0.00 70.16
Average: all 4.66 0.00 1.55 12.26 0.00 81.53
Average: 0 0.22 0.00 0.22 0.21 0.00 99.35
Average: 1 0.10 0.00 0.07 0.22 0.00 99.61
Average: 2 3.86 0.00 1.23 37.90 0.00 57.01
Average: 3 14.44 0.00 4.69 10.71 0.00 70.16

08:40:01 AM INTR intr/s
08:50:01 AM sum 3869.91
Average: sum 3869.91

08:40:01 AM CPU i000/s i001/s i008/s i009/s i012/s i050/s i066/s i074/s i217/s i225/s i233/s
08:50:01 AM 0 1000.27 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
08:50:01 AM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 112.25 0.00 0.00 0.00
08:50:01 AM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 92.54
08:50:01 AM 3 0.00 0.00 0.00 0.00 0.00 0.00 2664.85 0.00 0.00 0.00 0.00
Average: 0 1000.27 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 112.25 0.00 0.00 0.00
Average: 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 92.54
Average: 3 0.00 0.00 0.00 0.00 0.00 0.00 2664.85 0.00 0.00 0.00 0.00
... and a lot more.

Wednesday, June 18, 2008

Developing for web on machine with no IE (Mac, Linux)

Testing is an integral part of development.
And you don't have IE on Mac OS X. This was a big problem for me: I had to ask fellow developers to test my code in IE.

So to solve this problem I installed a free desktop virtualization tool.
Now I run Windows XP as a process on my Mac box and can find and fix IE bugs myself.

Get VirtualBox here.
http://www.virtualbox.org/wiki/Downloads

VirtualBox works on any x86 hardware and on many host Operating Systems like Windows and several Linux distributions.

Monday, June 16, 2008

Shantanu asked me to do so ...

This is pretty simple, just write your set of answers for the questions here!

1. What is your middle name?
Suresh

2. How big is your bed?
I sleep on the floor

3. What are you listening to right now?
The hum of the CPU fan of a box in my office

4. What was the last thing you ate?
Roti and curry

5. Last person you hugged?
My friend and classmate from college I met after 5 years ..

6. How is the weather right now?
Sunny

7. Who was the last person you talked to on the phone?
Nitin -- we discussed something just before deployment ..

8. The first thing you notice about the opposite sex?
feet and face ...

9. Favorite type of Food?
Fish, dal roti, pizza, ice cream. I actually eat anything.

10. Have you ever cried over a love lost?
No ..

11. Last Movie you watched?
Forrest Gump ..

12. Do you have any piercings?
No

13. Favorite Movie?
Many: Godfather-1,2, Jaane Bhi Do Yaaro, Munnabhai MBBS and a few more.

14. What were you doing before filling this out?
Reading RGV's blog and Aish comments on it

15. Have you ever loved someone?
Yes


16. Who would you like to see right now?
Mother and father.

17. What color are your bedroom walls?
Cream, I love that color coz you can't make out dust on the walls.

18. Have you ever fired a gun?
Yes. NCC camp.

19. Do you like to travel by plane?
Yeah. I love to see childlike wonder in people asking me to exchange my window seat with them.

20. Right-handed or Left-handed?
Right-handed.

21. If you could go to any place right now where would you go?
Goa, Konkan Coast, Everest .


22. Do you still watch cartoons on Saturday mornings?
Sometimes (my roommate watches them, so I have to)

23. What is the wallpaper on your cellphone?
Road in the wilderness

24. Favorite hangout:
My home, TGIF, Cubbon Park

25. 3 things you can't live without?
Run, computer, book.

26. Favorite songs?
Another brick in the wall, Coming back to life

27. What are you afraid of?
Failure, Tractor I collided with on my cycle.

28. What are your nicknames?
Sagar, Zimba, Zimbeshwar, Carlos, raju, Guruji, Goa.

29. Stuck on a deserted island, and can only bring one thing?
Pizza with extra cheese

30. First thing you'll save in a fire?
My family

31. What is your favorite color?
Blue

32. What are the things you always bring with you?
Home key

33. What did you want to be when you were a kid?
Footballer and engineer (I don't play well so I became an engineer)

34. What do you usually do when the alarm turns on?
Get up ..

35. What do you think about before you go to bed?
Why do I have to do this?


I'm Tagging everybody who reads this blog.

Thursday, May 29, 2008

How not to customize the wordpress blog?

Check out Amitabh Bachchan's blog by BigAdda.com.

They have the biggest header I've ever seen (947 x 964 pixels). It occupies the whole viewport, which goes against good UE/UI guidelines.
Users have to scroll down to get to the actual content they came for.

Such absurd approaches may be acceptable on a front page in the name of "branding" or something similar,
but they are surely avoidable on individual post pages (which BigAdda didn't do).

Wednesday, May 21, 2008

Lambi Judai -- Old song from Hero and new one from Jannat

Hero


Jannat

Champions League Final 07/08 at 12:15am (22 May 2008) and Ji Sung Park


The sea of red at Red Square, Moscow must be the biggest red gathering since the collapse of the Soviet Union. The Red Devils are playing the Blues in an all-English fight for the European championship.

Live telecast will start at 12:15 am IST (22nd May 2008, early morning).

Chelsea owner Roman Abramovich will be happy to see his team playing in his place of birth.

Some Trivia:
1. This is only the 3rd time that both teams in a Champions League final are from the same country.
2. Ji Sung Park will be the first Asian to play in a Champions League final.

Wednesday, April 30, 2008

Talk on fundamentals of AJAX usability



This talk was given at the GWT Conference by Kelly Norton

Condoleezza Rice blames Asians for food shortage

US Secretary of State Condoleezza Rice tried to defend her government over the biofuels issue. Diversion of food to the production of biofuel is causing a global food shortage. In 2005 the Bush administration embraced biofuels. Subsidized production of corn for ethanol reduced the amount of food grain cultivated. As a result, prices of food grains went up. They signed a pact with Brazil with the intention of forming an "OPEC"-like cartel of "ethanol fuel producers" (source).

Now she comes up with a master stroke: blame Indians and Chinese for the rise in food prices. According to her,
"growing Indian and Chinese appetite is contributing to the global food crisis" (source). I personally feel that this comment is "politically incorrect" and somewhat heartless. What she is saying may be partially correct (although Montek Singh Ahluwalia does not agree with her). It looks like she is making these comments to defend policies formed by her government.

Monday, April 28, 2008

Amazon webservices tools (EC2 and S3)

1. A Firefox plugin to manage your EC2 account visually. You can see the available AMIs, see running instances, shut down running instances, and start instances.

check this link for more details
Elasticfox - firefox plugin

I found this tool very useful; it saves a lot of time.

2. A command-line S3 tool for creating buckets and pushing data to S3 servers.
s3sync

This tool is very good for writing simple scripts for backup, storage, bulk upload, etc.

3. S3fox - as the name suggests, a Firefox plugin for S3
s3fox

Tuesday, April 22, 2008

Dilbert.com flash widget. Good example for How to screw up the UI.




To read a strip you need 3 clicks per strip. It looks like the UI designer forgot a fundamental rule of design (clicks are expensive).

Monday, April 14, 2008

Maoist Power in Nepal and India

Yesterday evening, when I heard the news of the Communist Party of Nepal (Maoist) winning most of the results declared so far, the first thought that struck my mind was that this may be a morale booster for the Maoists in India (Naxalites, as they are popularly known).

With the Maoists in power in Nepal, cross-border movement from the Naxal-affected states of India near Nepal, like Bihar, Jharkhand, West Bengal, Madhya Pradesh and some parts of UP, will increase. This will be of some concern to India. Although the CPI-Maoist has a very small cadre in India, it does control parts of Andhra Pradesh and Jharkhand.

This will be an interesting twist in India's relationship with Nepal; we have to wait and see how Indian diplomats handle it when most Indian politicians are busy with the next General Elections.

Sunday, April 06, 2008

Its Usability and not being funky

I stumbled upon this talk by Aza Raskin at Google Talks.
A must-watch if you are involved in writing computer software and expect people to use it.



Also see this Sample Chapter from book Don't make me think

Sunday, March 16, 2008

Economics in one page

I've been following economics blogs and articles for about a year now.
I found it a very interesting subject just because it could keep me interested in it for more than a month.

After a year I found a good article on fundamentals of economics.

http://thinkingonthemargin.blogspot.com/2007/09/economics-in-one-page.html

Monday, January 14, 2008

getAttribute -- problem on IE (Internet Explorer 6/7) -- YUI is affected

The getAttribute function has a problem on IE with form elements (so far I've found it only with the form element).

<html>
<head>
</head>
<body>
<script>
// Show the form element, then the value of its "action" attribute.
function getAtr(){
  var form = document.getElementById("mform");
  alert(form);
  alert(form.getAttribute("action"));
}
window.onload=getAtr;
</script>
<form action="hello" id="mform">
  <!-- An input named "action" is what triggers the IE 6/7 problem -->
  <input name="action" value="i m not taking action"/>
</form>
</body>
</html>


Save the above code in a file and open it with IE/FF/Safari.
The expected output of the second alert is "hello", but IE 6/7 prints an object instead.

The culprit is the input tag that is a child of the form. The name of the input tag is action, and it is this input object that is printed in the alert.
Solution: don't use action as the name of an input in a form (especially when you will be manipulating the form in JavaScript).

YUI dialog is affected by this IE bug.

Friday, December 28, 2007

Pulsar 200cc starting problem - "Pigtail clutch switch" malfunction

I have a Pulsar 200cc, bought 10 months back. I love the bike. For the last 2 weeks I've been having problems starting it. Sometimes the electric start did not work. After 2-3 tries it used to work, so I ignored it initially. Yesterday the problem became serious and the bike didn't start after repeated tries for 20 minutes. A "dhakka start" (push start) was the only option as there is no kick start (a serious problem with the bike).

I showed it to my dealer (Khivraj Motors, Bangalore) and they told me that it's a known issue with the Pulsar 200cc. The "pigtail clutch switch" was not working.
While replacing the clutch and fixing the starting problem they didn't connect the battery properly, and I had to go back to the dealer with a "dhakka start".

Thursday, December 27, 2007

Sitaron se aage jahan aur bhi hain -- Allama Iqbal

"Sitaron se aage jahan aur bhi hain
Abhi Ishq ke imtehan aur bhi hain
Tahi zindagi se nahin yeh fazayen
Yahan sankaron karwaan aur bhi hain
Qana'at na kar aalim e rang o boo par
Chaman aur bhi, aashian aur bhi hain
Agar kho gaya ek nasheman tau kya gham
Mukamat e aah o faghan aur bhi hain
Tu shaheen hai, parwaz hai kaam tera
Tere saamne aasman aur bhi hain
Isi roz o shab mein ulajh kar na reh jaa
Ke tere zaman o makan aur bhi hain
Gaye din ke tanha tha main anjuman mein
Yahan ab mere raazdan aaur bhi hain"

------------------------
Literal Translation
-------
There are worlds beyond the stars
There are more tests of love still to come
This world is not bereft of life
There are hundreds of caravans here
Don't be satisfied with the world of colour and smell
There are more flower gardens and more homes
If you have lost one nest, then what is the problem
There are more places for lamenting
You are a falcon, flying is your job
There are more skies ahead of you
Don't get lost in these days and nights
For you have more times and places.
Gone are those days when I was lonely in gatherings,
Now I have more friends here.
---------------------------
Meaning
--------
There is more to life.
Do not be shaken by failures,
Look ahead for more opportunities.

Thursday, December 20, 2007

Signals in Social Supernets

"Social network sites (SNSs) provide a new way to organize and navigate an egocentric social network. Are they a fad, briefly popular but ultimately useless? Or are they the harbingers of a new and more powerful social world, where the ability to maintain an immense network—a social "supernet"—fundamentally changes the scale of human society? This article presents signaling theory as a conceptual framework with which to assess the transformative potential of SNSs and to guide their design to make them into more effective social tools. It shows how the costs associated with adding friends and evaluating profiles affect the reliability of users' self-presentation; examines strategies such as information fashion and risk-taking; and shows how these costs and strategies affect how the publicly-displayed social network aids the establishment of trust, identity, and cooperation—the essential foundations for an expanded social world."

http://jcmc.indiana.edu/vol13/issue1/donath.html

Thursday, October 11, 2007

Usability on web


Usability can be defined as the degree to which a given piece of software assists the person sitting at the keyboard to accomplish a task, as opposed to becoming an additional impediment to such accomplishment.


http://stats.bls.gov/ore/htm_papers/st960150.htm

Friday, October 05, 2007

Yahoo! Hack Day .......

It started and I met David Filo ...

Christian Heilmann talking about Yahoo hack days ...

Tuesday, September 04, 2007

Thukra ke usne mujhko

Thukra ke usne mujhko,
kaha ki muskuraao!
Maine has diya,
aakhir sawal uski khushi ka tha.
Maine khoya woh jo mera tha hi nahi,
Usne khoya wo jo sirf usi ka tha.

Sunday, July 08, 2007

Combining Social Network and personal network as recruiting tool

I've been working in the software industry for the last 4 years as an engineer. Since most work in the software industry is technical in nature, recruiting people with the right skills is very important.

A strong technical background is a fundamental requirement, and it can be tested fairly easily because of the binary nature of the tests.

Broad areas typically tested in technical interviews are
1. Fundamental knowledge of the technology.
2. Strong problem solving skills.
3. Learning skills.

The general question an interviewer has to answer in any interview is
"Would you like to work with this person?"
This is the tough one. Interviewers ask themselves this question about a candidate before making a decision. While answering it they also have to answer "Would my teammates like to work with this person?"
These questions are the most difficult to answer because you meet the person for only an hour; there is a good chance of an error of judgment, and you may form a false opinion about the person.

A few things that help us decide about a person are
1. Attitude of person toward other people, life, work.
2. Professional maturity level. (For experienced candidate).
3. Ethical values.
4. Verification of facts stated in resume.

Now, how can one find these things out about a person?

A few things companies generally do are
1. Ask candidates for references.
This solution has its own problems: the candidate can easily control the information shared by such a source.
2. Employ a detective agency to do the background work.
This is a very costly option and is surely overkill.
3. Get in touch with the candidate's teachers or professors.
4. Hire a professional company that does this work as its primary job.

With the advent of "social networking" sites like orkut and LinkedIn,
we have one more way of finding out about a person:
through mutual friends, and by finding out which forums the person participates in.
Information shared by a mutual friend carries more trust.
Activity partners, or people who play some game with the candidate, can give a lot of valuable information about attitude, team spirit and ethics. Old colleagues and classmates can tell you more about professional maturity, smartness, ambitions, team spirit, work and even technical strength.

Of all these sources, "social networking" combined with one's personal network is a low-cost, easy and trusted way of recruiting the right candidate.

Friday, July 06, 2007

Information Extraction Open Source Software

Information extraction is still a very big problem in any text-mining application.
Following are links to some of the software available.

I've not used any of it, but I wrote some extraction software at my work.

http://uima-framework.sourceforge.net
http://www.research.ibm.com/UIMA/

and

http://gate.ac.uk

Monday, April 16, 2007

Wednesday, March 07, 2007

Is it a Tree?

Given the number of nodes n of a directed graph (nodes are named 1 to n),
and all the edges of the graph in the form (a,b), where a is the start and b is the end of the edge.

Can you suggest an algorithm to check whether it's a tree or not?
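
One possible answer, as a Python sketch rather than the only approach: a directed graph is a rooted tree if it has exactly n-1 edges, exactly one node with no incoming edge (the root), no node with more than one incoming edge, and every node is reachable from the root. The function name is made up, and the sketch assumes every edge refers to nodes in 1..n and treats an empty graph as not a tree.

```python
from collections import deque


def is_tree(n, edges):
    """Return True if the directed graph on nodes 1..n is a rooted tree."""
    if n == 0 or len(edges) != n - 1:
        return False

    children = {node: [] for node in range(1, n + 1)}
    indegree = {node: 0 for node in range(1, n + 1)}
    for a, b in edges:
        children[a].append(b)
        indegree[b] += 1

    # Exactly one root, and no node may have two parents.
    roots = [node for node, deg in indegree.items() if deg == 0]
    if len(roots) != 1 or any(deg > 1 for deg in indegree.values()):
        return False

    # Breadth-first search from the root must reach every node.
    seen = {roots[0]}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return len(seen) == n
```

For example, is_tree(3, [(1, 2), (1, 3)]) holds, while a cycle such as [(1, 2), (2, 3), (3, 1)] is rejected by the edge count alone.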

Monday, February 26, 2007

Probability: can 3 pieces of a stick form a triangle?

What is the probability that a stick (size 10cm) broken into 3 pieces forms a triangle?
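
The puzzle does not say how the stick is broken. Assuming the two break points are chosen independently and uniformly along the stick, the exact answer is 1/4, whatever the length. The Python sketch below estimates it by simulation; the function names, trial count and seed are arbitrary choices for illustration.

```python
import random


def forms_triangle(a, b, c):
    """Triangle inequality: each piece must be shorter than the other two combined."""
    return a < b + c and b < a + c and c < a + b


def estimate_triangle_probability(length=10.0, trials=200_000, seed=0):
    """Monte Carlo estimate, assuming two independent uniform break points."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = sorted((rng.uniform(0, length), rng.uniform(0, length)))
        if forms_triangle(x, y - x, length - y):
            hits += 1
    return hits / trials
```

Under a different breaking model (for example, break once and then break the longer piece again) the probability is different, so the assumption matters.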

Friday, January 26, 2007

Great lines from "The Godfather" by Mario Puzo

1. Behind every great fortune there is a crime.
2. When a man is generous, he must show the generosity as personal.
3. I gave him the offer he couldn’t refuse.
4. Negotiations
Never get angry
Never make a threat. Reason with people.
5. Never let anyone outside the family know what you are thinking. Never let them know what you have under your fingernails
6. It's business, not personal.
7. It’s all personal, every bit of business. Every piece of shit every man has to eat every day of his life is personal. They call it business.
8. Accidents don’t happen to people who take accidents as a personal insult.
9. There are things that have to be done and you do them and you never talk about them. You don’t try to justify them. They can’t be justified. You just do them. Then you forget it.
10. Every man has but one destiny.
11. I’ll reason with him.
12. Man of respect
13. Man of reasonableness.
14. Great men are not born great, they grow great.
15. Lawyers can steal more money with a briefcase than a thousand men with guns and masks.
16. There is no greater natural advantage in life than having an enemy overestimate your faults, unless it is to have a friend underestimate your virtues.
17. Any profession was worthy of respect to men who for centuries earned bread by the sweat of their brows.
18. That’s life, everyone here could tell his own tale of sorrow.
19. Made his bones.
20. You cannot say ‘no’ to people you love, not often.
21. Revenge is a dish that tastes best when it is cold.
22. If I can die saying “Life is so beautiful”, then nothing else is important.
23. Never stand for any outsider against your family.
24. A man who doesn't spend time with his family can never be a real man.

Monday, January 08, 2007

6174 - mystery


The number 6174 is a really mysterious number. At first glance, it might not seem so obvious. But as we are about to see, anyone who can subtract can uncover the mystery that makes 6174 so special.
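
The post does not spell out the procedure; the property usually associated with 6174 is Kaprekar's routine: take a four-digit number whose digits are not all the same, arrange the digits in descending and ascending order, subtract the smaller from the larger, and repeat. A small Python sketch of that routine (the function names are made up for illustration):

```python
def kaprekar_step(n):
    """One round: largest digit arrangement minus smallest, keeping four digits."""
    digits = sorted(f"{n:04d}")
    smallest = int("".join(digits))
    largest = int("".join(reversed(digits)))
    return largest - smallest


def steps_to_6174(n):
    """Count how many rounds it takes for n to reach 6174."""
    if not 0 < n < 10000 or len(set(f"{n:04d}")) == 1:
        raise ValueError("need a four-digit number with at least two different digits")
    steps = 0
    while n != 6174:
        n = kaprekar_step(n)
        steps += 1
    return steps
```

For example, 3524 goes to 3087, then 8352, then 6174, and 6174 maps to itself (7641 - 1467 = 6174).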

Tuesday, January 02, 2007

Shared Development: Character encoding detection

In my last project I did a lot of similar work.
I wrote about it here.
The following article also provides similar information.
Shared Development: Character encoding detection

Tuesday, December 19, 2006

Wednesday, November 08, 2006

Character Set Encoding Detection -- Part 1

Character set encoding detection becomes necessary when you start processing non-English text.

I started working on south-east Asian languages a year back and had to port some code. This particular code worked fine for English text and never gave any problems with some European non-English languages.

But when I started working on SEA languages, I knew beforehand that encoding issues would make our life hell, and they really did.

Most software supported only English in the old days, and that led to the common myth that 1 byte = 1 char.
It takes time to digest things like characters bigger than one byte and character streams with characters of variable length.

Then comes the issue of which byte sequence is which character. Different countries follow different encodings; if one just gets some text as a stream of bytes and has no idea about the encoding, there is only a small chance that the text will be processed correctly.

Character set and character encoding are two concepts that are generally used interchangeably, but they mean different things.

Character Set: Just a collection of characters
e.g. Kannada characters, Devanagari characters, Japanese characters, English alphabets
Character Encoding: A mapping from each character of a character set to a numerical value (and hence to a sequence of bytes).
e.g. UTF-8, UTF-16, EUC-JP, EUC-KR, ISO-8859-1 to 7, ISO-2022-JP
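
To make the distinction concrete, here is a small Python sketch (not taken from the project described above) that encodes one and the same character under several encodings. The helper name and the chosen list of encodings are just for illustration.

```python
def encoded_forms(text, encodings=("utf-8", "utf-16-be", "euc_jp", "shift_jis")):
    """Return the byte sequence that each encoding produces for the same text."""
    return {name: text.encode(name) for name in encodings}


forms = encoded_forms("あ")  # one Japanese character, U+3042
for name, raw in forms.items():
    # One character, but a different number and value of bytes per encoding.
    print(name, len(raw), raw.hex())
```

The character is the same in every case; only the bytes differ, which is why a byte stream is unreadable until you know which encoding produced it.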

European languages have fewer characters, which fit in a single byte, so most European languages use the ISO-8859-[1-7] character encodings.
SEA languages, commonly referred to as CJKV (Chinese, Japanese, Korean and Vietnamese), have far more characters than a single byte can hold.
Best Reference for CJKV

There are also attempts to standardize the character sets and encodings
1. Unicode
2. Wikipedia Unicode link

It's very clear that different publishers have their own preferences in choosing a character encoding. Actually, most of them are not even aware of it; they just publish in whatever encoding their tools use by default.

When somebody browses or crawls your page, they need to know the encoding of the text you sent in order to read it or process it programmatically. Here comes the problem of unknown encoding.
What should a browser or your program do when it faces this issue? Most browsers try to detect the encoding and use it. Detecting the encoding is not an exact science, but it works well for most pages.

In simple words, this detection is done by checking for the occurrence of certain patterns in the byte stream.
The detector that Mozilla provides works better if you configure it to detect the encodings common to your language.
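
As a much cruder illustration of the idea (this is not how the Mozilla detector works, which relies on statistical analysis), one can try a short list of candidate encodings and keep the first one under which the bytes decode without error. The function name and candidate list are assumptions for this Python sketch; narrowing the candidates to the encodings common for your language mirrors the advice above.

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "euc_jp", "iso-8859-1")):
    """Return the first candidate encoding that decodes the bytes cleanly, or None."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on byte patterns that are invalid
        except UnicodeDecodeError:
            continue
        return name
    return None
```

The order of candidates matters: ISO-8859-1 accepts every possible byte sequence, so it only makes sense as the last resort. A clean decode also does not prove the guess is right, which is why real detectors look at byte-pattern frequencies as well.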

-----------
References
1. http://www.mozilla.org/projects/intl/chardet.html
2. http://sourceforge.net/projects/icu/

Sudoku Solver in C++ and Lisp

These are weekend hacks and may be inefficient; if you see any possible improvement,
leave a comment.

C++ code.

The following Lisp code uses a functional style of programming. I wrote it when I was getting back into Lisp,
so most of the code uses car, cdr, cons, if and quote, and some of the functions are inefficient.
Lisp Code

Friday, September 29, 2006

Lisp - Strings and Characters

Lisp has an awkward syntax for characters.
In Lisp a string is an array of characters. According to the HyperSpec definition of string

" A string is a specialized vector whose elements are of type character or a subtype of type character. When used as a type specifier for object creation, string means (vector character)."

The simple example of Lisp string are
"hello lisp"
"this post is about lisp strings and characters"

Now let's see some of the string functions
1) string-concat
As expected, this function concatenates all the strings passed as arguments (string-concat is a CLISP extension, not part of standard Common Lisp)
(string-concat)
""
(string-concat "Hello")
"Hello"
(string-concat "Hello" " ")
"Hello "
(string-concat "Hello" " " "World")
"Hello World"

2) subseq
This is a general function that works on any kind of sequence, such as lists and arrays
(subseq "Hello" 2 4)
"ll"

3) stringp
This function checks whether the given argument is a string
(stringp "hello")
T
(stringp '(1 2 3))
NIL

4) string-equal is a function which compares two strings with case ignored.
(string-equal "hello" "Hello")
T

(string-equal "hello" "hel")
NIL

5) reverse works on any sequence
(reverse '(1 2 3))
(3 2 1)

(reverse "hello")
"olleh"



Characters form strings
Let's see the character at index 3 of the string "hello" (indexing is zero-based, so this is the 4th character)
(char "hello" 3)
#\l

1) Each character is associated with a number; one can find that number by
using the function char-code

(char-code #\a)
97
In this case the code returned matches ASCII, but in general it is whatever code the implementation uses.
(char-int #\a)
97

2) digit-char-p checks whether the character passed as argument is a digit; it returns the digit's value, or NIL
(digit-char-p #\a)
NIL
(digit-char-p #\9)
9
(digit-char-p #\1)
1

3) code-char returns the character for a given code
(code-char 123)
#\{

4) Some special characters
#\Tab
#\Newline
#\Space

All of the above character names are case-insensitive


clisp supports Unicode, and using the function code-char one can easily get
the character for any code.

Friday, August 18, 2006

AOL search Queries -- Most Common Websites Searched

My analysis of the AOL search data continues. Many people use web search to find websites.
The following are the most frequently searched websites.
  1. google.com 568668
  2. yahoo.com 455047
  3. myspace.com 381046
  4. ebay.com 225424
  5. aol.com 169733
  6. askjeeves.com 161628
  7. mapquest.com 120475
  8. disneychannel.com 113156
  9. msn.com 76905
  10. hotmail.com 64289
  11. pogo.com 59658
  12. bankofamerica.com 34671
  13. chase.com 33306
  14. cravelyrics.com 16526

Most common links for keyword "google.com"
  1. http://www.google.com 348057
  2. http://earth.google.com 5474
  3. http://maps.google.com 4736
  4. http://images.google.com 2296
  5. http://news.google.com 1670
  6. http://www.google.co.uk 1607
  7. http://toolbar.google.com 1111
  8. http://video.google.com 707
  9. http://groups.google.com 637
  10. http://scholar.google.com 536
  11. http://googleblog.blogspot.com 510
  12. http://www.google.com.au 430
  13. http://www.googlecom.com 397
  14. http://www.google.ca 383
  15. http://directory.google.com 319

AOL search queries - "How" question are more popular




AOL published the search data of 500k+ (657k+) "randomized" users. The intentions were good, but the release raised privacy concerns and prompted reports like the one in the NYTimes.

I just played with this data to find some general information. I tried to aggregate some data about the questions people ask a search engine.

One thing is clear: people like to ask "How" very much. Out of around 21 million unique queries, around 0.16 million were questions (how, what, where, when, which). More than 50% of these are "How" queries; see the table below.

Question Type count
  • how 91397
  • what 57659
  • where 11630
  • when 9600
  • which 1729
Total 168452 unique question queries.
Of these, 98424 queries (58.43%) generated at least one click.
Altogether they generated 216007 clicks (1.28 clicks per query).
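
The post does not say how the question queries were identified. One plausible way, sketched here in Python, is to classify a query by its first word; that rule, the function names and any sample queries are assumptions for illustration, not the method actually used. The percentage helper simply reproduces the arithmetic behind the figures above.

```python
from collections import Counter

QUESTION_WORDS = ("how", "what", "where", "when", "which")


def count_question_types(queries):
    """Count queries by leading question word (assumed classification rule)."""
    counts = Counter()
    for query in queries:
        words = query.lower().split()
        if words and words[0] in QUESTION_WORDS:
            counts[words[0]] += 1
    return counts


def percentage(part, total):
    """Share of part in total, as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


# Figures quoted in the post: 98424 of 168452 question queries led to a click.
print(percentage(98424, 168452))
```

With the counts from the table, percentage(91397, 168452) gives the share of "how" queries, which is where the "more than 50%" observation comes from.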

Page Clicks Percentage
  • 1 175871 81.42
  • 2 22723 10.52
  • 3 7152 3.31
  • 4 3062 1.42
  • 5 1805 0.84
  • 6 1200 0.56
First-page results still answer most of the questions.