Blaine Sheldon • over 1 year ago
Where can we access archival game footage?
With regard to the challenge to "Create a tool that extracts fundamental Statcast metrics (e.g., pitch speed, exit velocity) from archival game videos using computer vision.", is there a historic feed or archive we can reference?

5 comments
Michelle Brain Manager • over 1 year ago
Hello Blaine,
Great question! One way to access audio/video files is through the "2024-mlb-homeruns.csv" file on GitHub that has a link to an .mp4 video file for each home run in that dataset. We are working on other options as well and will surface those when confirmed.
Links:
https://github.com/MajorLeagueBaseball/google-cloud-mlb-hackathon/blob/main/datasets/2024-mlb-homeruns.csv
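As a rough illustration of working with that file, the sketch below pulls every .mp4 link out of the CSV and downloads a few clips. It scans all cells instead of relying on a column name, since the CSV's header names are not stated in this thread. The raw-file URL is an assumption derived from the GitHub link above, and `find_video_links` and `download` are hypothetical helper names.

```python
import csv
import io
import urllib.request
from pathlib import Path

# Assumption: raw-content form of the GitHub link given in the comment above.
RAW_CSV_URL = (
    "https://raw.githubusercontent.com/MajorLeagueBaseball/"
    "google-cloud-mlb-hackathon/main/datasets/2024-mlb-homeruns.csv"
)


def find_video_links(csv_text):
    """Return every cell value that looks like a link to an .mp4 file."""
    links = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for value in row.values():
            if not isinstance(value, str):
                continue  # skip missing cells and overflow columns
            cell = value.strip()
            if cell.lower().startswith("http") and cell.lower().endswith(".mp4"):
                links.append(cell)
    return links


def download(url, dest_dir="clips"):
    """Save one video into dest_dir and return the local path."""
    Path(dest_dir).mkdir(exist_ok=True)
    target = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, target)
    return target


if __name__ == "__main__":
    with urllib.request.urlopen(RAW_CSV_URL) as resp:
        text = resp.read().decode("utf-8")
    for link in find_video_links(text)[:3]:  # only a few clips, to be polite
        print(download(link))
```

If the CSV turns out to have a dedicated video column, reading that column by name would be simpler and stricter than scanning every cell.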
Best of luck with your project!
Michelle Brain Manager • over 1 year ago
Hi Blaine, thanks again for your interest in the Google Cloud x MLB™ Hackathon! I wanted to follow up on your question about accessing audio/video files, as we have just confirmed another option: it is OK to use public YouTube videos (of MLB games or highlights) for the purpose of this hackathon. Gemini can analyze YouTube videos directly, which would help with this. The team has some demo notebooks here: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/video-analysis
Best of luck with your project!
T I • over 1 year ago
Are there .mp4 files available for pitching? The Statcast website provides videos, but not files for download.
Michelle Brain Manager • over 1 year ago
Great question! I've asked the team and hope to hear back soon.
Michelle Brain Manager • over 1 year ago
Hello, I heard back from the team: other than the home run (HR) videos already provided, there aren't any .mp4 files available directly from MLB. You can use Stats API data to find the equivalent video on Film Room (see example: https://colab.sandbox.google.com/drive/1QcZD-_VK-Fa9ZC_iNy6Cth0n67KF2dSC?usp=sharing#scrollTo=PF8OmaDKunjM), and then scrape the video files from Film Room.
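One possible shape for the Stats API step is sketched below: collect the play IDs of pitches from a game feed, then build a Film Room search link for each. The feed endpoint, the `liveData.plays.allPlays[].playEvents[]` layout, the `isPitch` and `playId` fields, and the `playid="..."` search query are all assumptions to check against the linked notebook. The helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed endpoints; confirm both against the example notebook.
FEED_URL = "https://statsapi.mlb.com/api/v1.1/game/{game_pk}/feed/live"
FILM_ROOM_SEARCH = "https://www.mlb.com/video/search?q="


def fetch_feed(game_pk):
    """Download the game feed for one game as a dict."""
    with urllib.request.urlopen(FEED_URL.format(game_pk=game_pk)) as resp:
        return json.load(resp)


def pitch_play_ids(feed):
    """Collect the playId of every event flagged as a pitch."""
    ids = []
    plays = feed.get("liveData", {}).get("plays", {}).get("allPlays", [])
    for play in plays:
        for event in play.get("playEvents", []):
            if event.get("isPitch") and event.get("playId"):
                ids.append(event["playId"])
    return ids


def film_room_search_url(play_id):
    """Build a Film Room search link for one play (assumed query format)."""
    return FILM_ROOM_SEARCH + quote(f'playid="{play_id}"')


if __name__ == "__main__":
    game_pk = 0  # placeholder: replace with a real gamePk from the Stats API
    for play_id in pitch_play_ids(fetch_feed(game_pk))[:5]:
        print(film_room_search_url(play_id))
```

This only produces the search pages; locating the actual video file on each page is the scraping step mentioned above and is not covered here.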
The other option would be to use public YouTube videos (of MLB games or highlights) for a similar purpose (though this might be less structured). Gemini has the ability to analyze YouTube videos directly, which would help with this - demo notebooks for this are here: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/video-analysis
Best of luck with your project!