Twins Analytics Infrastructure
The Twins have had a bit of a tortured history with analytics. In 2010 Rob Antony did an interview with TwinsDaily’s own Parker Hageman and revealed some interesting facts about the Twins and sabermetrics. Antony stated this about their analytics department: “we're probably one of the last, if not the last, team to address it with a person dedicated solely to that.” He went on to show that he did not understand some fairly basic sabermetric concepts. He thought FIP was “first strike in inning pitched” and was unable to offer a guess about BABIP. He then revealed they had just hired their analytics guy and stated he would be “Gathering information and creating databases. This will be his first year. The guy that we brought in will start creating systems to build a foundation of our own that we can look at.” That foundation is what I primarily want to get into, as I have a background in IT.
In corporate America, one of the techniques we use to understand what our competition is doing is to analyze their job postings. Have they posted an unusually large number of sales positions? Are they looking at specific geographic locations that have a concentration of talent? Are they asking for specific or unusual technical skills? These are all things we can look at to get an idea of intent and structure. I applied this technique to the Twins' development job postings and found some interesting things.
One of the common details in both job postings is that the Twins were looking for a developer with experience doing front-end work (HTML, JavaScript), the middle tier (.NET Framework, ASP.NET MVC), and the data layer (SQL Server). This implies a couple of things. The first is that the Twins are employing a standard three-tier architecture for their analytics.
It also implies that they only have “full stack” developers, which means each developer is required to know and be able to develop in all three of their architecture tiers. This is problematic because if you are required to be able to code in everything, that usually means you are unable to specialize or gain really in-depth knowledge of any single tier. For the Twins to take the next step in analytics, I think they need to be hiring specialists in each of these areas.
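For readers who have not worked with a three-tier design, here is a minimal sketch of the idea in Python. It is not the Twins' code or stack: sqlite3 stands in for SQL Server, a plain function stands in for the .NET middle tier, and the table, function names, and pitch rows are all invented for illustration.

```python
import sqlite3


# Data tier: a relational table. sqlite3 is a stand-in for SQL Server here.
def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pitches (pitcher TEXT, pitch_type TEXT, velocity_mph REAL)"
    )
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO pitches VALUES (?, ?, ?)",
        [
            ("Pitcher A", "FF", 93.0),
            ("Pitcher A", "FF", 95.0),
            ("Pitcher B", "FF", 90.0),
        ],
    )
    return conn


# Middle tier: business logic that queries the data tier and shapes the result.
def average_velocity(conn, pitch_type):
    rows = conn.execute(
        "SELECT pitcher, AVG(velocity_mph) FROM pitches "
        "WHERE pitch_type = ? GROUP BY pitcher ORDER BY pitcher",
        (pitch_type,),
    ).fetchall()
    return [{"pitcher": p, "avg_mph": round(v, 1)} for p, v in rows]


# Presentation tier: turns the middle tier's records into HTML for a browser.
def render_table(records):
    body = "".join(
        f"<tr><td>{r['pitcher']}</td><td>{r['avg_mph']}</td></tr>" for r in records
    )
    return f"<table><tr><th>Pitcher</th><th>Avg mph</th></tr>{body}</table>"


if __name__ == "__main__":
    conn = open_store()
    print(render_table(average_velocity(conn, "FF")))
```

Each function only talks to the tier next to it, which is the property that lets a team staff each tier with a specialist instead of asking one person to own all three.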
Another thing I noticed is that the only data store they referred to is SQL Server. This is important because, while the industry still values relational databases like SQL Server, it is also moving in the direction of unstructured Big Data repositories. Tools like Hadoop, HBase, MongoDB, and many others allow unstructured data to be stored and analyzed quickly, which allows for more experimentation by analysts than a structured database does. I think the PITCHf/x and Trackman data has likely been analyzed enough, and the next frontier is less structured data. Medical records could go into a big data store so that test results and notes can be analyzed for patterns that identify healthier players. Free-text scouting reports could go in as well, with natural language analytics from IBM Watson or some other AI service run on them to identify key language or sentiments that indicate a player who is more likely to succeed. Weather data could be added to analyze its impact on specific players. I think there is a lot of room to grow here.
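As a rough illustration of why a schemaless store makes this kind of experimentation easier, here is a small Python sketch. It uses plain dictionaries rather than Hadoop, MongoDB, or Watson, and it does simple term counting rather than real sentiment analysis. The reports, player names, and helper functions are all made up for the example.

```python
import re
from collections import Counter

# Invented, illustrative documents. Note that the fields differ from record
# to record, which a fixed relational schema would not allow without changes.
reports = [
    {"player": "Player A",
     "notes": "Plus fastball, repeatable delivery, competes well."},
    {"player": "Player B",
     "notes": "Fastball flattens out; delivery has effort.",
     "weather": "cold, windy"},
    {"player": "Player C",
     "medical": "Elbow soreness in April",
     "notes": "Loose arm, plus changeup."},
]


def term_counts(docs, field="notes"):
    """Count how many documents mention each word in the given text field."""
    counts = Counter()
    for doc in docs:
        text = doc.get(field, "")  # documents missing the field are skipped
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def docs_with_field(docs, field):
    """List the players whose document happens to carry an optional field."""
    return [d["player"] for d in docs if field in d]


if __name__ == "__main__":
    print(term_counts(reports).most_common(3))
    print(docs_with_field(reports, "medical"))
```

An analyst can add a new field such as weather or medical notes to some records and start querying it right away, without first redesigning tables. A production system would swap the dictionaries for a document store and the word counts for a real language model.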
In short, I think it is likely that this lack of specialization, along with not embracing the newer Big Data technologies, is what led Thad Levine and Derek Falvey to go in a new direction with the analytics department this past fall. I wouldn’t be surprised if the hiring surge described in a recent article by Pat Reusse included hires to address these concerns. I am interested in your thoughts and feedback.