Author | Message
jmethven
Joined: 16 May 2005 Posts: 51
Posted: Sat Dec 27, 2008 6:05 pm Post subject: Computer Programming
I think it makes sense to start a thread here where we can share advice and ask questions about making good use of computer programming in gathering data. I, for one, am tired of manually entering data into Excel spreadsheets whenever I embark on a statistical project. I am aware of a lot of basic tricks with Excel (web queries, text to columns), but I don't really know anything more advanced than that.
How do other posters here go about gathering data? I've always wondered how, to give one example, Ken Pomeroy can apparently press a button and generate up-to-date player and team data for his website. I suspect that it is not a simple process and requires some real programming know-how, but maybe I am wrong.
So, does anyone have general tips to offer? At the moment, one of my goals is to parse college basketball play-by-play data to calculate plus-minus. That might be a difficult endeavor, but there are simpler things that I would like to be able to do as well.
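To make the plus-minus goal concrete, here is a minimal Python sketch of the bookkeeping involved. It assumes the play-by-play has already been parsed into simple event tuples; the `("sub", ...)` and `("score", ...)` shapes and the function name `plus_minus` are hypothetical and chosen only for illustration, not taken from any real data feed.

```python
def plus_minus(starters, events):
    """Return {player: plus_minus} for one team.

    starters: names of the five players on the floor at tip-off.
    events:   tuples in game order (hypothetical format), either
              ("sub", player_in, player_out)  for this team's substitutions
              ("score", points, scored_by_us) for any made shot or free throw
    """
    on_court = set(starters)
    totals = {player: 0 for player in on_court}
    for event in events:
        if event[0] == "sub":
            _, player_in, player_out = event
            on_court.discard(player_out)
            on_court.add(player_in)
            # Make sure a bench player appears in the result even at 0.
            totals.setdefault(player_in, 0)
        elif event[0] == "score":
            _, points, scored_by_us = event
            change = points if scored_by_us else -points
            # Everyone on the floor is credited or debited equally.
            for player in on_court:
                totals[player] += change
    return totals
```

The hard part in practice is usually producing a clean event list (especially who is on the floor at each moment) from the raw text, not this final tally.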
Ryan J. Parker
Joined: 23 Mar 2007 Posts: 708 Location: Raleigh, NC
Posted: Sat Dec 27, 2008 6:13 pm
I've never asked Ken directly, but I suspect he, like me, uses Perl for a lot of data collection tasks.
There are some nice Perl modules that help with the process, but actually getting the data is fairly easy compared to parsing it. There are other modules that can help with the parsing as well, but really you just need to learn the ins and outs of Perl regular expressions, loops, hashes, arrays, etc.
_________________
I am a basketball geek.
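For readers who have not done this kind of parsing, here is a rough sketch of the regular-expression-plus-hash approach described above, written in Python rather than Perl. The line format ("MM:SS player made|missed shot description") is an assumption made up for this example; real play-by-play pages will need their own patterns.

```python
import re

# Hypothetical line format: "MM:SS <player> made|missed <shot description>"
LINE_RE = re.compile(
    r"^(?P<min>\d{1,2}):(?P<sec>\d{2})\s+"
    r"(?P<player>.+?)\s+(?P<result>made|missed)\s+(?P<shot>.+)$"
)

def parse_line(line):
    """Turn one play-by-play line into a dict, or None if it doesn't match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "seconds_left": int(m.group("min")) * 60 + int(m.group("sec")),
        "player": m.group("player"),
        "made": m.group("result") == "made",
        "shot": m.group("shot"),
    }

def count_makes(lines):
    """Tally made shots per player in a dict (the Python analogue of a Perl hash)."""
    makes = {}
    for line in lines:
        play = parse_line(line)
        if play is not None and play["made"]:
            makes[play["player"]] = makes.get(play["player"], 0) + 1
    return makes
```

Lines that do not match the pattern are skipped rather than raising an error, which tends to be the safer default when the source formatting is inconsistent.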
daniel
Joined: 27 Dec 2008 Posts: 1
Posted: Sun Dec 28, 2008 3:02 am
I was considering writing a program in Perl to grab, parse, and process data. I could do it; I just don't know if I have the time. I also don't know the legal ramifications of distributing a program capable of culling data from established sources.
If anyone wants to work with me on this project, please PM!
crimsonc
Joined: 16 Dec 2008 Posts: 4
Posted: Sun Dec 28, 2008 4:05 am
I've got Java code I've written that can harvest statistical data from websites. It saves the data as CSV files for easy import into OO Calc or Excel. All the actual stat generation is done in Java; I just use spreadsheets as a convenient display format until I get around to writing a Java front end.
It took about two weeks to develop for WCBB, and I'm currently adding MCBB as well. There is nothing wrong with writing source code to collect stats. There may be something wrong with actually running said code. I know the ESPN site terms prohibit any 'automated' use of their site, which I would assume includes stat harvesting. You are allowed to download a copy for personal use, though, so presumably you could download it and then harvest it to your heart's content. But obviously any commercial use of that is illegal. I'm not using ESPN as my source, since their WCBB coverage is crap.
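The "save it as CSV for the spreadsheet" step is small enough to sketch. The version below is in Python rather than Java and is only an illustration: the function name `rows_to_csv` and the column names in the usage note are assumptions, and the rows are expected to have been extracted already.

```python
import csv
import io

def rows_to_csv(rows, fieldnames):
    """Render a list of dicts as CSV text with a header row.

    rows:       list of dicts, one per player or game (already extracted).
    fieldnames: column order for the output file.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Writing the returned text to a file such as `stats.csv` gives something Excel or OO Calc can open directly. The `csv` module handles quoting, so a value that itself contains a comma still lands in a single column.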
HoopStudies
Joined: 30 Dec 2004 Posts: 705 Location: Near Philadelphia, PA
Posted: Sun Dec 28, 2008 1:46 pm
I hate Perl, but that is a personal preference. Visual Basic and Python tend to be my tools of choice. VB can automate things with Excel, too.
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necessarily represent positions, strategies or opinions of employers.
jkubatko
Joined: 05 Jan 2005 Posts: 702 Location: Columbus, OH
Posted: Sun Dec 28, 2008 9:38 pm
HoopStudies wrote: I hate Perl, but that is a personal preference.
Perl rules. :-)
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
Serhat Ugur (hoopseng)
Joined: 13 Oct 2006 Posts: 208 Location: Basketball Research
Posted: Mon Dec 29, 2008 4:04 pm
I used to use Excel, but I hired a freelancer from www.guru.com for my website.
It saved me a ton of time, and I think it was cheap. In no way do I see myself getting involved in programming.
_________________
http://www.nbastuffer.com
basketballvalue
Joined: 07 Mar 2006 Posts: 208
Posted: Mon Dec 29, 2008 4:33 pm
jkubatko wrote:
HoopStudies wrote: I hate Perl, but that is a personal preference.
Perl rules.
Yes, count me as pro-Perl. I wrote basketballvalue.com using Perl and MySQL to download and process the data (with R for generating the adjusted +/- results), and then PHP to display the data on the web out of the MySQL database.
Thanks,
Aaron
_________________
www.basketballvalue.com
Follow on Twitter
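The store-then-display pattern described in this post can be sketched in a few lines. This is not the basketballvalue.com code: it is in Python, it swaps in the standard-library `sqlite3` module as a stand-in for MySQL, and the table name `plus_minus` and its columns are hypothetical.

```python
import sqlite3

def load_and_rank(rows):
    """Store (player, value) pairs in a database table, then read them back ranked.

    Uses an in-memory SQLite database as a stand-in for a MySQL server.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE plus_minus (player TEXT PRIMARY KEY, value INTEGER)"
    )
    # Parameterized inserts keep odd characters in names from breaking the SQL.
    conn.executemany("INSERT INTO plus_minus VALUES (?, ?)", rows)
    conn.commit()
    ranked = conn.execute(
        "SELECT player, value FROM plus_minus ORDER BY value DESC, player"
    ).fetchall()
    conn.close()
    return ranked
```

Keeping the processed numbers in a database means the display layer only has to run a query, which is the role PHP plays in the setup described above.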