I've been having fun using the new StackOverflow site for answering technical questions.
Here is a Python script I call 'StackOverflow Fight' (like Google Fight). It takes a list of tags and gets the count of questions tagged with each one.
As an example, here is a StackOverflow fight between some popular programming languages:
#!/usr/bin/env python
import urllib
import re
langs = ('python', 'ruby', 'perl', 'java')
url_stem = 'http://stackoverflow.com/questions/tagged/'
counts = {}
for lang in langs:
    resp = urllib.urlopen(url_stem + lang).read()
    m = re.search('summarycount.*>(.*)<', resp)
    count = int(m.group(1).replace(',', ''))
    counts[lang] = count
    print lang, ':', count
sorted_counts = sorted(counts.items(), key=lambda(k,v):(v,k))
sorted_counts.reverse()
print sorted_counts[0][0], 'wins with', sorted_counts[0][1]
Output:
python : 733
ruby : 391
perl : 167
java : 1440
java wins with 1440
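The script leans on a single regular expression to pull the count out of the tag page. Below is a minimal Python 3 sketch of just that step. The HTML snippet is a hypothetical stand-in for the page markup (the real markup may differ, and has quite possibly changed since), and `parse_count` is a helper name made up for this example. It uses a slightly stricter pattern than the script above and returns None instead of raising when nothing matches.

```python
import re

# Assumed shape of the counter element; the real page markup may differ.
SAMPLE_HTML = '<div class="summarycount">1,440</div>'

# Stricter than 'summarycount.*>(.*)<': only digits and commas are captured.
COUNT_RE = re.compile(r'summarycount[^>]*>([\d,]+)<')

def parse_count(html):
    """Return the question count found in html, or None if there is no match."""
    m = COUNT_RE.search(html)
    if m is None:
        return None
    # Drop the thousands separators before converting to int.
    return int(m.group(1).replace(',', ''))

if __name__ == '__main__':
    print(parse_count(SAMPLE_HTML))
```

Keeping the parsing separate from the fetching means the pattern can be checked against a saved page without hitting the site.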
8 comments:
Add C# to the mix. Guaranteed it crushes Java.
I was correct -- C# hauls in 2209 questions.
Does this officially mean that C# is the most difficult language on StackOverflow? ;-)
Yes, it is absolutely true. C# wins. Change the fourth line to
langs = ('python', 'ruby', 'perl', 'java','c%23')
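The '%23' here is the percent-encoded form of '#', which would otherwise be read as the start of a URL fragment. As a small sketch (written for Python 3, where the function lives in `urllib.parse`; the `tag_url` helper is a name made up for this example), the encoding can be done in code rather than by hand:

```python
from urllib.parse import quote

URL_STEM = 'http://stackoverflow.com/questions/tagged/'

def tag_url(tag):
    """Build the tag page URL, percent-encoding characters like '#' and '+'."""
    # safe='' so that every reserved character in the tag is encoded.
    return URL_STEM + quote(tag, safe='')

if __name__ == '__main__':
    for tag in ('python', 'c#', 'c++'):
        print(tag_url(tag))
```

With this, the tag list can hold the readable names ('c#', 'c++') and the encoding happens only when the URL is built.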
And PHP ;) It's close to Java.
shon@ubuntu:~/python$ date
Tue Dec 8 18:28:56 IST 2009
shon@ubuntu:~/python$ python stack_fight.py
python : 14226
ruby : 5932
perl : 3147
java : 27568
c%23 : 51962
php : 21305
c%23 wins with 51962
Cool post. I thought it would be fun to see how things have changed over the last 3 1/2 years since the last post on this. I also added other languages for fun:
Java has 255997 hits
PHP has 235259 hits
Javascript has 220234 hits
C# has 220234 hits
C++ has 130371 hits
.NET has 119156 hits
Python has 112801 hits
Objective-C has 87613 hits
SQL has 83174 hits
C has 61612 hits
Ruby has 47578 hits
Perl has 18683 hits
delphi has 15376 hits
Groovy has 4237 hits
Lisp has 1937 hits
Go has 891 hits
Pascal has 480 hits
Ada has 324 hits
Basic has 73 hits
Logo has 23 hits
NXT-G has 1 hits
Java wins with 255997
So Java is still on top, and StackOverflow is Overflowing with questions and answers!
I also modified your code a bit: a) it prints the languages after the list is sorted, and
b) it checks for a None result, which I got a couple of times when playing around with your fun script.
Here is the updated code:
import urllib
import re
langs = ('Python', 'Ruby', 'Perl', 'Java', 'C++', 'PHP', 'C', 'Go',
         'Javascript', 'C#', 'Groovy', 'Objective-C', 'Basic', 'SQL',
         'delphi', '.NET', 'Lisp', 'Pascal', 'Ada',
         'Logo', 'NXT-G', 'Visual Basic')
url_stem = 'http://stackoverflow.com/questions/tagged/'
counts = {}
for lang in langs:
    resp = urllib.urlopen(url_stem + lang).read()
    m = re.search('summarycount.*>(.*)<', resp)
    if m is None:
        # No count found on the page: record 0 rather than reusing
        # the count left over from the previous language.
        counts[lang] = 0
    else:
        count = int(m.group(1).replace(',', ''))
        counts[lang] = count
    #print lang, ':', count
sorted_counts = sorted(counts.items(), key=lambda(k,v):(v,k))
sorted_counts.reverse()
for name,hcount in sorted_counts:
    print name ,"has",hcount,"hits"
print ''
print sorted_counts[0][0], 'wins with', sorted_counts[0][1]
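For anyone running this on Python 3: the tuple-unpacking lambda (`lambda(k,v):(v,k)`) is no longer valid syntax there, and `print` is a function. A rough equivalent of the sort-and-print step is sketched below. The `rank` helper is a name made up for illustration, the sample counts are the ones from the first run in the post plus a hypothetical 'logo' entry with no count, and tags whose count is None are simply skipped.

```python
def rank(counts):
    """Return (tag, count) pairs sorted by count, highest first.

    Tags with a count of None (no match on the page) are left out.
    """
    found = [(tag, n) for tag, n in counts.items() if n is not None]
    # Sort on (count, tag), the same key the original script used.
    return sorted(found, key=lambda kv: (kv[1], kv[0]), reverse=True)

if __name__ == '__main__':
    # 'logo': None is a made-up entry standing in for a failed lookup.
    sample = {'python': 733, 'ruby': 391, 'perl': 167, 'java': 1440, 'logo': None}
    ranked = rank(sample)
    for tag, n in ranked:
        print(tag, 'has', n, 'hits')
    print(ranked[0][0], 'wins with', ranked[0][1])
```

Passing `reverse=True` to `sorted` replaces the separate `sorted_counts.reverse()` call.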
Great idea, Corey. Let's see where things stand in June 2012:
Java has 255997 hits
PHP has 235259 hits
Javascript has 220234 hits
C# has 220234 hits
C++ has 130371 hits
.NET has 119156 hits
Python has 112801 hits
Objective-C has 87613 hits
SQL has 83174 hits
C has 61612 hits
Ruby has 47578 hits
Perl has 18683 hits
delphi has 15376 hits
Groovy has 4237 hits
Lisp has 1937 hits
Go has 891 hits
Pascal has 480 hits
Ada has 324 hits
Basic has 73 hits
Logo has 23 hits
NXT-G has 1 hits
Java wins with 255997
So Java is still 'on top', and wow, what a lot of questions and answers in 2 1/2 years! StackOverflow is overflowing for sure. ;)
I also made a few small changes to your script: a) added more languages, b) the languages are printed in order of hit count, and c) added a check for a None that I got twice when playing around with the language list; I'm not sure why, though. Here is the update; hopefully it displays OK:
import urllib
import re
langs = ('Python', 'Ruby', 'Perl', 'Java', 'C++', 'PHP', 'C', 'Go',
         'Javascript', 'C#', 'Groovy', 'Objective-C', 'Basic', 'SQL',
         'delphi', '.NET', 'Lisp', 'Pascal', 'Ada',
         'Logo', 'NXT-G', 'Visual Basic')
url_stem = 'http://stackoverflow.com/questions/tagged/'
counts = {}
for lang in langs:
    resp = urllib.urlopen(url_stem + lang).read()
    m = re.search('summarycount.*>(.*)<', resp)
    if m is None:
        # No count found on the page: record 0 rather than reusing
        # the count left over from the previous language.
        counts[lang] = 0
    else:
        count = int(m.group(1).replace(',', ''))
        counts[lang] = count
sorted_counts = sorted(counts.items(), key=lambda(k,v):(v,k))
sorted_counts.reverse()
for name,hcount in sorted_counts:
    print name ,"has",hcount,"hits"
print ''
print sorted_counts[0][0], 'wins with', sorted_counts[0][1]
@Sol. cool!