/r/PhishData
/r/PhishData is for sharing analysis of the decades of Phish data, creating visualizations of that data, and trading tips, techniques, and ideas for future work in the area.
/r/PhishData is the subreddit for analysis and visualization of data related to the band Phish. Phish has played hundreds of shows since the early 80s, every one of which has a unique setlist. Many of their songs contain long freeform improvisation and are never played the same way twice.
Setlist information and fan-made recordings of shows are freely available online, giving us a huge wealth of data to analyze!
Phish-adjacent content is okay.
All of our track info is available as a JSON file if anyone wants to include it in their data analysis. It should include song, date, location, duration, and tour.
Hit me up either here or on Twitter @phishjustjams if you're interested.
Another part of my project to use Phish data had me thinking about how to run through all shows to see which shows have the most overlap in songs. This is what I found:
Most Similar Phish Shows Overall:
Since these were all very early in Phish's career (not surprising, fewer songs to choose from = more overlap) I decided to do the same for shows within 2.0 and 3.0 too.
Most Similar 2.0 Shows:
Most Similar 3.0 Shows:
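For anyone who wants to try the overlap comparison themselves, here is a minimal sketch. The post doesn't say exactly how "overlap" was measured, so this assumes Jaccard similarity (shared songs divided by all distinct songs across the two shows). The function names and the three tiny setlists are made up for illustration and are not real show data.

```python
from itertools import combinations


def overlap_pct(a, b):
    """Percent overlap between two setlists, assumed here to be Jaccard similarity."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


def most_similar(shows, top_n=3):
    """Compare every pair of shows and return the top_n (pct, id1, id2) tuples."""
    pairs = [
        (overlap_pct(s1, s2), d1, d2)
        for (d1, s1), (d2, s2) in combinations(sorted(shows.items()), 2)
    ]
    return sorted(pairs, key=lambda p: -p[0])[:top_n]


# Made-up setlists keyed by a hypothetical show id
shows = {
    "show_a": ["Tweezer", "Stash", "Ghost", "Sand"],
    "show_b": ["Tweezer", "Stash", "Ghost", "Light"],
    "show_c": ["Fee", "Reba", "Sand"],
}

for pct, d1, d2 in most_similar(shows):
    print(f"{d1} vs {d2}: {pct:.1f}% overlap")
```

Restricting the `shows` dict to one era before calling `most_similar` would give the per-era lists.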
My program ran this comparison for all shows against each other, and I came up with a way to determine the most "typical" show: take each show, find its top 5 most similar shows, then average the percent overlap it has with each of those 5. Probably not the most accurate way to do it, but whatever.
Most Typical Phish Shows Overall:
Most Typical 2.0 Shows:
Most Typical 3.0 Shows:
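The "typical show" idea described above (average overlap with a show's top 5 matches) could be sketched roughly like this. The overlap measure is an assumption (Jaccard similarity), `typicality` is a made-up name, and the setlists are invented for the example.

```python
def overlap_pct(a, b):
    """Percent overlap between two setlists (assumed: Jaccard similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return 100.0 * len(a & b) / len(union) if union else 0.0


def typicality(shows, k=5):
    """For each show, average its overlap with its k most similar other shows."""
    scores = {}
    for show_id, songs in shows.items():
        overlaps = sorted(
            (overlap_pct(songs, other) for d, other in shows.items() if d != show_id),
            reverse=True,
        )
        top = overlaps[:k]
        scores[show_id] = sum(top) / len(top) if top else 0.0
    return scores


# Invented setlists, not real shows
shows = {
    "show_a": ["Tweezer", "Stash", "Ghost", "Sand"],
    "show_b": ["Tweezer", "Stash", "Ghost", "Light"],
    "show_c": ["Fee", "Reba", "Sand"],
}

scores = typicality(shows, k=2)
print(max(scores, key=scores.get))  # the most "typical" show in this toy data
```

With only three shows the example uses `k=2`; the post's version uses the top 5.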
I was inspired by Mayacelium again and came up with an idea to rank how well shows align with the song choices for the 10 Dinner and a Movie shows. Essentially, I wrote a data query that went through the tracks of every show and gave a point for every time a song in the show was played in a DaaM stream. (Tweezer has been on 5 streams so far, so any show with Tweezer gets 5 points; Stash has only been in one stream, so a show gets one point for any Stash.) Like Maya, I also only counted shows from 97 and later.
Least Dinner and a Movie-ish Shows:
Most Dinner and a Movie-ish Shows:
Most Dinner and a Movie-ish Dinner and a Movie Shows:
% - 3 set shows
Not too surprising that shows with more tracks (especially 3 set shows) had higher scores and shows with fewer tracks (Baker's Dozen and late 90s shows) had lower scores.
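Here's a rough sketch of the scoring described in this post, in case anyone wants to play with it. It assumes each song counts once per stream it appeared in and that every track in a show is scored (so a song played twice in one show would score twice). The names `stream_counts` and `daam_score` and the sample lists are hypothetical, not the actual query or data.

```python
from collections import Counter


def stream_counts(streams):
    """Count how many streams each song appeared in (at most once per stream)."""
    counts = Counter()
    for setlist in streams:
        counts.update(set(setlist))
    return counts


def daam_score(show_tracks, counts):
    """Sum, over the show's tracks, the number of streams each song appeared in."""
    return sum(counts[song] for song in show_tracks)


# Toy data: three pretend streams and one pretend show
streams = [
    ["Tweezer", "Stash"],
    ["Tweezer"],
    ["Tweezer", "Ghost"],
]
counts = stream_counts(streams)
print(daam_score(["Tweezer", "Stash", "Fee"], counts))  # 3 for Tweezer + 1 for Stash + 0 for Fee
```

Because the score is a plain sum, longer shows naturally score higher; dividing by the number of tracks would be one way to compare shows of different lengths.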
I’ve been doing the 2.0 #jamBracket by @WeekendWook on Twitter and the 3.0 Jam of the Era bracket on phish.net and have been curious (as an exercise) what the community would come up with as a point system for jams. Here’s what I’ve got as a basis...
1 point - every 30 seconds of the jam
1 point - every minute of a slow build (5/22/00 Ghost)
1 point - every tease
2 points - every quote
2 points - every stop/start
3 points - every time the theme of the jam changes
3 points - every time someone besides Trey causes the theme change
3 points - every peak
5 points - unique theme (e.g. IT waves or Mohegan BASOS)
What else? Extra points if the jam sounds like Eno (12/30/19 Tweezer @ 20 mins)?
Note: I understand some won’t want to quantify, but am curious.
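To make the list above easier to experiment with, here's a minimal sketch of it as a function. A few things are assumptions: the duration points use whole 30-second blocks, the "someone besides Trey" points are treated as a bonus on top of the regular theme-change points, and the unique-theme points are awarded per unique theme. The `Jam` fields and the example numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Jam:
    seconds: int                      # total length of the jam
    slow_build_minutes: int = 0       # minutes spent in a slow build
    teases: int = 0
    quotes: int = 0
    stop_starts: int = 0
    theme_changes: int = 0            # all theme changes
    non_trey_theme_changes: int = 0   # subset caused by someone besides Trey
    peaks: int = 0
    unique_themes: int = 0


def jam_score(j):
    """Apply the proposed point values to one jam."""
    return (
        j.seconds // 30                   # 1 point per 30 seconds
        + j.slow_build_minutes            # 1 point per minute of slow build
        + j.teases                        # 1 point per tease
        + 2 * j.quotes
        + 2 * j.stop_starts
        + 3 * j.theme_changes
        + 3 * j.non_trey_theme_changes    # assumed to stack with the line above
        + 3 * j.peaks
        + 5 * j.unique_themes
    )


# A made-up 10-minute jam
example = Jam(seconds=600, teases=1, theme_changes=2, non_trey_theme_changes=1, peaks=1)
print(jam_score(example))  # 20 + 1 + 6 + 3 + 3 = 33
```

Keeping the weights in one place like this makes it easy to try other values and see how the rankings move.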
And top three songs with the most day of week bias: Sand, Light, Backwards Down the Number Line
Do we never miss a Sunday show because it is more likely that they will play Back on the Train than any other song?
I did a little data analysis project to see if certain songs were more or less likely to be played on certain days of the week. Results are here: https://jroefive.github.io/2020/04/30/Day-Of-Week-Bias-In-Phish-Setlists.html
I'm planning to do a bunch of these and my ideas are here: https://jroefive.github.io/phish/shakedown.html
I'm open to requests or suggestions on what to tackle next!
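For anyone who wants to poke at the day-of-week question themselves, one possible way to measure bias is sketched below. This is not necessarily the method used in the linked write-up. It compares how often a song is played on each weekday with how often it is played overall, so a value above 1 means the song shows up more than expected on that day. The function name and the dates are arbitrary placeholders, not real show dates.

```python
import calendar
from collections import Counter
from datetime import date


def weekday_lift(show_dates, song_dates):
    """Per-weekday play rate of a song divided by its overall play rate."""
    if not show_dates or not song_dates:
        raise ValueError("need at least one show and one play of the song")
    shows_per_day = Counter(d.weekday() for d in show_dates)
    plays_per_day = Counter(d.weekday() for d in song_dates)
    overall_rate = len(song_dates) / len(show_dates)
    return {
        calendar.day_name[day]: (plays_per_day[day] / n_shows) / overall_rate
        for day, n_shows in shows_per_day.items()
    }


# Placeholder dates: two Fridays, two Saturdays, two Sundays
show_dates = [date(2024, 1, d) for d in (5, 6, 7, 12, 13, 14)]
# The song was played at one Friday show and both Sunday shows
song_dates = [date(2024, 1, 5), date(2024, 1, 7), date(2024, 1, 14)]

for day, lift in weekday_lift(show_dates, song_dates).items():
    print(f"{day}: {lift:.2f}")
```

With small play counts these ratios get noisy, so a minimum number of plays per song is probably worth adding before ranking songs by bias.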
Anyone have experience using the phish.in API along with the Python Requests library? I have a valid API key but keep getting an invalid API key error. It's probably something simple, but some pointers to working code would be nice.
What I have now is this:
import requests
apikey = 'YOUR_API_KEY'  # placeholder, replace with your own key
url = 'https://phish.in/api/v1'
headers = {'Accept': 'application/json', 'Authorization': 'Bearer ' + apikey}
r = requests.get(url + '/years', params={'include_show_counts': 'true'}, headers=headers)