APBRmetrics :: View topic - Computer Programming
The statistical revolution will not be televised.
 

Computer Programming

 
jmethven



Joined: 16 May 2005
Posts: 51

Posted: Sat Dec 27, 2008 6:05 pm    Post subject: Computer Programming

I think it makes sense to start a thread here where we can share advice and ask questions about making good use of computer programming in gathering data. I, for one, am tired of manually entering data into Excel spreadsheets whenever I start a statistical project. I am aware of a lot of basic tricks with Excel (web queries, text to columns), but I don't really know anything more advanced than that.

How do other posters here go about gathering data? I've always wondered how, to give one example, Ken Pomeroy can apparently press a button and generate up-to-date player and team data for his website. I suspect that it is not a simple process and requires some real programming know-how, but maybe I am wrong.

Does anyone have general tips to offer? At the moment one of my goals is to parse college basketball play-by-play data to calculate plus-minus. That might be a difficult endeavor, but there are simpler things that I would like to be able to do as well.
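
For illustration, the plus-minus goal can be sketched in a few lines of Python once the play-by-play has been reduced to scoring and substitution events. This is only a minimal sketch under assumptions: the event tuples, the `plus_minus` function, and the player labels are all hypothetical, and a real play-by-play feed would have to be parsed into this shape first.

```python
from collections import defaultdict

def plus_minus(events, starters):
    """Compute plus-minus per player from a simplified event list.

    events: list of tuples, either
        ("score", team, points) or
        ("sub", team, player_out, player_in)
    starters: dict mapping team -> set of players on the floor at tip-off.
    Both structures are hypothetical; a real feed must be parsed into them.
    """
    on_court = {team: set(players) for team, players in starters.items()}
    totals = defaultdict(int)
    for event in events:
        if event[0] == "score":
            _, team, points = event
            for side, players in on_court.items():
                # Players on the scoring team gain the points; opponents lose them.
                delta = points if side == team else -points
                for player in players:
                    totals[player] += delta
        elif event[0] == "sub":
            _, team, player_out, player_in = event
            on_court[team].discard(player_out)
            on_court[team].add(player_in)
    return dict(totals)

# Made-up example with two players per side to keep it short.
starters = {"home": {"A", "B"}, "away": {"X", "Y"}}
events = [
    ("score", "home", 2),
    ("sub", "home", "B", "C"),
    ("score", "away", 3),
]
result = plus_minus(events, starters)
```

The hard part in practice is usually reconstructing who is on the floor from the raw text, not the arithmetic shown here.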
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC

Posted: Sat Dec 27, 2008 6:13 pm

I've never asked Ken directly, but I suspect he, like myself, uses perl for a lot of data collection tasks.

There are some nice perl modules that help the process, but getting the data is fairly easy compared to parsing it. There are some other modules that can help here as well, but really you just need to learn the ins and outs of perl regular expressions, loops, hashes, arrays, etc.
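
To illustrate the parsing side, here is a rough sketch of the regular-expression-plus-hash approach described above, written in Python rather than perl purely for illustration. The line format, the `LINE_RE` pattern, and the field names are invented for the example; real box-score text varies by site.

```python
import re

# Hypothetical box-score line format: "Name  MIN  FGM-FGA  PTS"
LINE_RE = re.compile(
    r"^(?P<name>[A-Za-z .'-]+?)\s+(?P<min>\d+)\s+"
    r"(?P<fgm>\d+)-(?P<fga>\d+)\s+(?P<pts>\d+)$"
)

def parse_line(line):
    """Return a dict of fields for one box-score line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    # Convert every numeric field from text to int.
    for key in ("min", "fgm", "fga", "pts"):
        row[key] = int(row[key])
    return row

sample = [
    "J. Smith   32  5-11  14",
    "TOTALS",  # lines that do not match the pattern are skipped
    "A. O'Neal  18  2-4   6",
]
rows = [r for r in (parse_line(s) for s in sample) if r is not None]
```

Each parsed row is a dictionary (the Python counterpart of a perl hash), and `rows` is the array of them.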
_________________
I am a basketball geek.
daniel



Joined: 27 Dec 2008
Posts: 1

Posted: Sun Dec 28, 2008 3:02 am

I was considering writing a program in perl to grab/parse/process data. I could do it, I just don't know if I have the time. I also don't know the legal ramifications of distributing a program capable of culling data from established sources.

If anyone wants to work with me on this project, please PM me!
crimsonc



Joined: 16 Dec 2008
Posts: 4

Posted: Sun Dec 28, 2008 4:05 am

I've got Java code I've written that can harvest statistical data from websites. It saves the data as CSV files for easy import into OO Calc or Excel. All the actual stat generation is done in Java. I just use spreadsheets as a convenient display format until I get around to writing a Java front end.

It took about two weeks to develop for WCBB, and I'm currently adding MCBB as well. There is nothing wrong with writing source code to collect stats. There may be something wrong with actually running said code. I know the ESPN site terms prohibit any 'automated' use of their site, which I would assume includes stat harvesting. You are allowed to download a copy for personal use, though, so presumably you could download it and then harvest it to your heart's content. But obviously any commercial use of that is illegal. I'm not using ESPN as my source, since their WCBB coverage is crap.
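
The CSV export step mentioned above can be sketched as follows. This is a Python illustration, not the Java code described in the post; the `write_stats_csv` helper, the column names, and the rows are all made up for the example.

```python
import csv
import io

def write_stats_csv(rows, fieldnames, handle):
    """Write a list of dicts to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)

# Made-up rows standing in for harvested data.
rows = [
    {"player": "Player One", "pts": 14, "reb": 7},
    {"player": "Player Two", "pts": 6, "reb": 3},
]

# An in-memory buffer keeps the example self-contained; for a real file, use
# open("stats.csv", "w", newline="") instead.
buffer = io.StringIO()
write_stats_csv(rows, ["player", "pts", "reb"], buffer)
csv_text = buffer.getvalue()
```

A file written this way opens directly in Excel or OO Calc, which is what makes CSV a convenient hand-off format.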
HoopStudies



Joined: 30 Dec 2004
Posts: 705
Location: Near Philadelphia, PA

Posted: Sun Dec 28, 2008 1:46 pm

I hate perl, but that is a personal preference. Visual Basic and python tend to be my tools of choice. VB can automate things with excel, too.
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers.
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Sun Dec 28, 2008 9:38 pm

HoopStudies wrote:
I hate perl, but that is a personal preference.


Perl rules. :-)
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
Serhat Ugur (hoopseng)



Joined: 13 Oct 2006
Posts: 208
Location: Basketball Research

Posted: Mon Dec 29, 2008 4:04 pm

I used to use excel but I hired a freelancer from www.guru.com for my website.

It saved me a ton of time and I think it was cheap. In no way do I see myself getting involved in programming.
_________________
http://www.nbastuffer.com
basketballvalue



Joined: 07 Mar 2006
Posts: 208

Posted: Mon Dec 29, 2008 4:33 pm

jkubatko wrote:
HoopStudies wrote:
I hate perl, but that is a personal preference.


Perl rules. :-)


Yes, count me as pro-perl. I've written basketballvalue.com using perl and MySQL (with R for generating the adjusted +/- results) to download and process the data, and then PHP to display the data on the web out of the MySQL database.

Thanks,
Aaron
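
The store-then-display part of a pipeline like the one described above can be sketched roughly as follows. This is an assumption-heavy illustration: it uses Python with the standard-library `sqlite3` module as a stand-in for perl, MySQL, and PHP, and the `player_games` table, its columns, and the rows are all invented.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database mentioned above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_games (player TEXT, game_date TEXT, plus_minus INTEGER)"
)

# Made-up rows standing in for downloaded and processed data.
processed = [
    ("Player One", "2008-12-27", 5),
    ("Player One", "2008-12-28", -2),
    ("Player Two", "2008-12-27", -4),
]
conn.executemany("INSERT INTO player_games VALUES (?, ?, ?)", processed)
conn.commit()

# A display layer would run a query like this and render the rows as a page.
season_totals = conn.execute(
    "SELECT player, SUM(plus_minus) FROM player_games "
    "GROUP BY player ORDER BY player"
).fetchall()
conn.close()
```

Keeping the processed data in a database means the display code only ever runs queries and never needs to touch the download or parsing steps.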
_________________
www.basketballvalue.com
All times are GMT - 5 Hours