I recently listened to a podcast that talked about a 1960s track titled “Amen, Brother” by The Winstons. It is the most-sampled track ever, according to whosampled.com. Here is a link to a particular part of the song, the 6-second drum solo known as the Amen Break, which is the reason for the 2,800+ samples heard in Hip-Hop, EDM, the Futurama intro, etc. I wanted to see what types of artists were sampling this song and how they used the sample.
The first step was to scrape all the songs that sampled Amen, Brother. I tend to use Python for web scraping, and this is roughly how things went.
import urllib.request
from bs4 import BeautifulSoup
import re
import time
import json

def extract_song_info(song_tag):
    """Takes in a tag containing song data and returns the artist, title,
    year, instrument sampled, and genre."""
    re_year = re.compile(r'\d{4}')  # matches a four-digit year

# get the HTML from the page, then parse it using BeautifulSoup
url = "https://www.whosampled.com/The-Winstons/Amen,-Brother/sampled/"
resp = urllib.request.urlopen(url)
page = resp.read()
soup = BeautifulSoup(page, 'html.parser')

scraping = True
num_pages = 1
song_info = []  # will hold each song's data

while scraping:
    # get all the tags that contain song data
    list_of_songs = soup.find_all('div', {'class': 'trackDetails'})
    for song in list_of_songs:
        song_info.append(extract_song_info(song))

    # see if we can go to another page or stop
    next_page = soup.find('span', {'class': 'next'})
    if next_page:
        num_pages += 1
        # get the link to the next page, then open and parse it
        next_url = "https://www.whosampled.com/" + next_page.a['href']
        soup = BeautifulSoup(urllib.request.urlopen(next_url).read(), 'html.parser')
    else:
        scraping = False

print(f"Total number of pages scraped: {num_pages}")
print(f"Total number of songs: {len(song_info)}")
Altogether, Amen, Brother was sampled 2,864 times.
Now, I wanted to use Spotify’s web API to get audio data for each song that I scraped. The audio data I looked for includes valence (a measure of how positive a song sounds), danceability, tempo (beats per minute), and energy. This part was difficult, as I could only search for a song by its title and artist and then check whether the results matched the title and artist I got from whosampled. This can lead to a few naming issues:
- Beyoncé vs Beyonce (accents in names)
- Diddy vs P.Diddy vs Puff Daddy (artists who changed names over time)
- Under Mi Sensi (Jungle Spliff) (X Project Remix) vs Under Mi Sensi (title naming conventions)
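One way to soften the naming issues listed above is to normalize both sides before comparing them. The following is a minimal sketch, not the matching code used for this post: `normalize_name`, `canonical_artist`, and the `ARTIST_ALIASES` table are hypothetical names introduced here for illustration, and the alias table would have to be filled in by hand.

```python
import re
import unicodedata

# Hypothetical, hand-maintained alias table (an assumption for illustration):
# maps normalized stage names to one canonical name.
ARTIST_ALIASES = {
    "p diddy": "diddy",
    "puff daddy": "diddy",
}

def normalize_name(text):
    """Lowercase a title or artist, strip accents and parenthetical suffixes."""
    # Decompose accented characters, then drop the combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Remove parenthetical parts such as "(X Project Remix)"
    no_parens = re.sub(r"\([^)]*\)", " ", no_accents)
    # Lowercase, turn runs of other characters into single spaces, and trim
    return re.sub(r"[^a-z0-9]+", " ", no_parens.lower()).strip()

def canonical_artist(name):
    """Normalize an artist name and resolve it through the alias table."""
    key = normalize_name(name)
    return ARTIST_ALIASES.get(key, key)

print(normalize_name("Beyoncé") == normalize_name("Beyonce"))
print(normalize_name("Under Mi Sensi (Jungle Spliff) (X Project Remix)"))
print(canonical_artist("P.Diddy"), canonical_artist("Puff Daddy"))
```

Dropping parentheticals is a trade-off: it lets a remix title match the plain title, but it can also pair a remix with the audio data of the original recording.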
I didn’t want to get too bogged down with the matching issue, so there is certainly room for improvement. I managed to get audio data for about 38% of the songs. Booo, but oh well. To do this, I created an OAuth2 session to interact with Spotify’s API. I made a post about this called “Understanding OAuth”, which I hope you’ll read if you don’t know what OAuth means or are wondering how to do this sort of thing in Python. I won’t show the code as it’s pretty ugly and takes away from the post. :)
Now I have songs and some audio data! Let’s get to ggplotting!
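For readers who want a starting point, here is a rough sketch of the two requests involved. It is not the code used for this post. It assumes Spotify's client-credentials flow (the post does not say which OAuth2 flow was used) and only builds the requests without sending them. The helper names `build_token_request` and `build_search_url` and the credentials are placeholders.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://accounts.spotify.com/api/token"
SEARCH_URL = "https://api.spotify.com/v1/search"

def build_token_request(client_id, client_secret):
    """Build (but do not send) a client-credentials token request."""
    # The client id and secret go into a Basic auth header, base64-encoded
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )

def build_search_url(title, artist, limit=5):
    """Build a track-search URL restricted to a title and an artist."""
    query = f"track:{title} artist:{artist}"
    params = urllib.parse.urlencode({"q": query, "type": "track", "limit": limit})
    return f"{SEARCH_URL}?{params}"

# Placeholder credentials; real ones come from a Spotify developer account
request = build_token_request("my-client-id", "my-client-secret")
print(request.get_method(), request.full_url)
print(build_search_url("Straight Outta Compton", "N.W.A"))
```

The access token returned by the first request would then be sent as a Bearer token with each search, and the track ID from the best match is what the audio-features lookup needs.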
glimpse(songs)
## Observations: 2,864
## Variables: 11
## $ Artist <chr> "N.W.A", "Skrillex", "Tyler, the Creator", "The P...
## $ Title <chr> "Straight Outta Compton", "Scary Monsters and Nic...
## $ Year <int> 1988, 2011, 2013, 1994, 2012, 1996, 1988, 1999, 1...
## $ Instrument <chr> "Drums", "Drums", "Drums", "Drums", "Multiple Ele...
## $ Genre <chr> "Hip-Hop / Rap / R&B", "Electronic / Dance", "Hip...
## $ ID <chr> "6KIKRz9eSTXdNsGUnomdtW", "73LjHkIsz1EcT9bn8z0cRU...
## $ Danceability <dbl> 0.833, 0.327, 0.441, 0.704, NA, NA, 0.751, 0.293,...
## $ Energy <dbl> 0.874, 0.957, 0.737, 0.949, NA, NA, 0.568, 0.997,...
## $ Valence <dbl> 0.4140, 0.4190, 0.4880, 0.2480, NA, NA, 0.6510, 0...
## $ Tempo <dbl> 102.866, 139.963, 78.462, 104.019, NA, NA, 119.30...
## $ Duration_MS <dbl> 258688, 233736, 254733, 402467, NA, NA, 106840, 2...
# breaks for x axis
year_breaks <- seq(1985, 2015, 5)
songs %>%
  group_by(Year) %>%
  count() %>%
  ggplot(aes(x=Year, y=n)) +
  geom_bar(stat='identity') +
  scale_x_continuous(breaks=year_breaks) +
  labs(subtitle="Samples of 'Amen, Brother' peaked mid 90s",
       title="Number of songs that sampled over time",
       y="Number of samples") +
  theme_bw()
We can see sampling of Amen, Brother peaked around 1995 and had a couple of smaller resurgences in the mid 2000s and around 2012. Let’s see if this trend holds when breaking down by genre. We first look at the number of songs by genre that sampled from Amen, Brother.
songs %>%
  group_by(Genre) %>%
  count(sort = TRUE)
## # A tibble: 10 x 2
## # Groups: Genre [10]
## Genre n
## <chr> <int>
## 1 Electronic / Dance 2564
## 2 Hip-Hop / Rap / R&B 149
## 3 Rock / Pop 66
## 4 Soundtrack 58
## 5 Other 17
## 6 World 3
## 7 Jazz / Blues 2
## 8 Reggae 2
## 9 Spoken Word 2
## 10 Easy Listening 1
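As a quick sanity check on the table above, the counts can be added up and turned into shares. This is a small sketch with the numbers copied by hand from the output above, and the variable names are just for illustration.

```python
# Genre counts copied from the table above
genre_counts = {
    "Electronic / Dance": 2564,
    "Hip-Hop / Rap / R&B": 149,
    "Rock / Pop": 66,
    "Soundtrack": 58,
    "Other": 17,
    "World": 3,
    "Jazz / Blues": 2,
    "Reggae": 2,
    "Spoken Word": 2,
    "Easy Listening": 1,
}

total = sum(genre_counts.values())
print(f"Total songs: {total}")  # 2864, matching the scraped total

# Share of all samples per genre, as a percentage rounded to one decimal
shares = {genre: round(100 * n / total, 1) for genre, n in genre_counts.items()}
for genre, share in shares.items():
    print(f"{genre}: {share}%")
```

Electronic / Dance alone accounts for roughly nine out of every ten songs, which is worth keeping in mind when comparing genres later on.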
To make things easier, let’s put every genre with 5 or fewer songs into the Other category, then plot the sample frequency colored by genre.
songs <- songs %>%
  mutate(Genre = fct_collapse(Genre,
                              Other = c("Other", "World",
                                        "Jazz / Blues", "Reggae",
                                        "Spoken Word", "Easy Listening")))
# reorder factors
songs <- songs %>% mutate(Genre = fct_rev(fct_infreq(.$Genre)))
# Color scheme for each genre
color_scheme <- c("#F0E442", "#0072B2", "#CC79A7", "#E69F00", "#999999")
# Songs that sampled over time by genre
songs %>%
  ggplot() +
  geom_area(aes(Year, fill=Genre),
            stat = 'bin',
            bins=30) +
  scale_fill_manual(values=color_scheme) +
  scale_x_continuous(breaks=year_breaks) +
  labs(subtitle="Hip hop caught the wave first, followed by EDM, then others",
       title="Songs that sampled over time by genre",
       y="Number of samples") +
  theme_bw()
So it seems that Hip-Hop artists were the only ones sampling in the mid to late 80s, before Electronic / Dance (EDM) caught on in the early 90s. This makes a bit of sense, as 1985 saw the introduction of a cheaper sampler that producers could afford. Indeed, by 1990 only one EDM song had sampled Amen, Brother (Ok Alright (CLUB MIX) by the Minutemen), while 23 Hip-Hop songs had. Such songs include Straight Outta Compton by N.W.A, I Desire by Salt-N-Pepa, Feel Alright Y’all by 2 Live Crew, and One Time Freestyle by Geto Boys. Excuse me while I take a short YouTube break to listen to them.
Now, why did EDM artists follow soon after? According to this article, British music producers on the dance music scene looked to the US for inspiration in the early 90s. Old breakbeats were dug out and the Amen Break featured heavily in jungle music. From Wikipedia, “Jungle is a genre of electronic music derived from breakbeat hardcore that developed in England in the early 1990s as part of UK rave scenes. The style is characterized by fast tempos (150 to 200 bpm), breakbeats, dub reggae basslines, heavily syncopated percussive loops, samples, and synthesized effects… Producers create the drum patterns by cutting apart breakbeats, the most common of which being the Amen break”.
In 1997, the Amen Break went mainstream when Oasis used it in the song D’You Know What I Mean. It also appeared in David Bowie’s hit song Little Wonder, released the same year.
Before I dive a bit deeper, let’s take a quick peek at the soundtracks that sampled it to see if there are any familiar ones.
## # A tibble: 58 x 1
## Title
## <chr>
## 1 Futurama Theme
## 2 Dirty City (Tokyo Stage)
## 3 Bomber Barbara
## 4 Mission 4
## 5 Bad Situation
## 6 Aridia - Traversing the Fortress
## 7 Sweet Soul Brother (B.B. Rights Mix)
## 8 Romulus 3
## 9 "Rival Battle: Metal Sonic \"Stardust Speedway\""
## 10 Battle! Kanto Champion
## # ... with 48 more rows
I thought it was nice to see soundtracks from Pokemon, Sonic, and Futurama there! I hope you also find something interesting in the list; see if you can hear the sample in the soundtrack.
Ok, now I wanted to temporarily disregard every genre except EDM and Hip-Hop and look at samples over time.
songs %>%
  filter(Genre %in% c("Hip-Hop / Rap / R&B", "Electronic / Dance")) %>%
  ggplot() +
  geom_density(aes(Year, fill=Genre), alpha =.75) +
  scale_fill_manual(values=color_scheme[4:5]) +
  scale_x_continuous(breaks=year_breaks) +
  labs(subtitle="EDM had more 'bursts' of interest in Amen, Brother",
       title="Density of songs that sampled over time",
       y="Count") +
  theme_bw()
So it seems that there was a huge burst of sampling interest from Hip-Hop artists lasting until the late 90s, after which it fell off until a small resurgence in the late 00s. I know the instrumentals in Hip-Hop tend to change in style every 10-15 years or so, so this could be a reason why. Also, Hip-Hop in the late 2000s was a dark era plagued by people like Soulja Boy and their long shirts, so maybe that’s why. I’m not too familiar with EDM, but it seems that the Amen Break remained pretty popular there over time.
Now, let’s get into the audio features of the songs. Recall that I was only able to get audio data for 38% of the songs. So these graphs and extrapolations only apply to a fraction of songs that sampled Amen, Brother.
Here we look at the distribution of song valence by genre. Spotify’s website says “the higher the value, the more positive mood for the song”. There is a dotplot added on top of the boxplot because some of the genres had very few points, so it was worth showing the individual points as well.
Instrumental-wise, it seems Rock/Pop and EDM sound slightly less positive in comparison to Hip-Hop, whose distribution seems to be spread almost evenly from 0 to 1.
These next few plots show that Spotify thinks these Hip-Hop songs are more danceable than the EDM songs which made me chuckle a bit. It might be because (from the EDM songs I know), EDM is kinda all over the place when it’s not building up for some sort of drop. It also isn’t super consistent rhythm-wise - I personally can’t dance to it at all.
We see, however, that the tempo for EDM is much higher than for Hip-Hop. This makes sense when you think about the Wikipedia description of Jungle music or listen to a random EDM song. Also, Rap in the 80s and 90s (when samples of Amen, Brother were highest) tended to be a bit slower and funkier, so the graphs for tempo and energy match there as well.
It’s crazy how popular this song was. I hope you can find it in some of the songs you fondly remember. Sadly, the creators received no royalties despite its success in influencing music to this very day. Thankfully, someone set up a GoFundMe for the living band members and it raised quite a bit of money.
If you want to see the code I wrote to scrape the data and use Spotify’s web API, the repository can be found here!