Photo by Phillip Glickman on Unsplash

Made a Robot that gets you the Lyrics to the current Top 40 songs.

Tari Yekorogha
5 min read · Apr 12, 2022


Web scraping with Python made fun.

We’ve all experienced this annoying survey:

It usually shows up when you’re trying to get something done quickly on the internet, and it makes us wonder, “Do they really think a robot can’t click on a checkbox?” Countless memes have been made around this question.

😂 Not to be a bummer, but the robot above is not the kind of robot the survey was made for. The survey was made for web crawlers. No, not the three Spider-Men, but programs made to automate the process of collecting data from the internet. These bots can’t pass the survey. At least some of them can’t 😉 (click here). Mine can’t, and it doesn’t need to. Anyway, the surveys are there to stop greedy bots that end up downloading a whole website, which can ruin the website.

The robot was inspired by the topic I'm currently learning (web scraping) on my journey to become a Data Analyst, and by my love for music.

Description:

  • The robot is programmed in Python. The libraries used are selenium, BeautifulSoup, and requests.
  • The robot goes to https://top40weekly.com and gets the top 40 songs, then goes to https://azlyrics.com and gets the lyrics, and finally saves the lyrics of each song as a text file in a folder on your Desktop.
  • The code and setup can be downloaded from my GitHub, which I’ll leave in the References section.

Making the Robot

Firstly, I loaded up the libraries that I would use to make the bot.

Then I coded the first task to be done by the bot:

Getting the Top 40 list

To get the list I used the requests module to download the webpage. Then I parsed the page’s contents as a Beautiful Soup object.
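A minimal sketch of this step, assuming the requests and beautifulsoup4 packages are installed. The function names, the timeout value, and the use of Python's built-in "html.parser" are assumptions for illustration; the article does not say which parser the original code uses.

```python
import requests
from bs4 import BeautifulSoup

TOP40_URL = "https://top40weekly.com"  # URL taken from the article


def download_page(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_page(html):
    """Turn raw HTML into a BeautifulSoup object that can be searched."""
    # "html.parser" ships with Python; other parsers would also work.
    return BeautifulSoup(html, "html.parser")
```

With these two helpers, getting the parsed chart page would look like soup = parse_page(download_page(TOP40_URL)).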

Next, I inspected the webpage to figure out how to get the list. The inspection revealed that the list is stored under a <div> tag that belongs to the class “x-text”, and that it is split into four groups of ten songs, each group in its own <p> tag.

Exhibit A:

Exhibit B:

Exhibit C:

After some trial and error, I got the specific <div> tag the list is in and collected all of its <p> tags.
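Locating the <div> and its <p> tags could look roughly like the sketch below. The function name and the index parameter are hypothetical: the article does not say which of the page's “x-text” divs holds the chart, so the sketch lets the caller pick one.

```python
from bs4 import BeautifulSoup


def chart_paragraphs(html, index=0):
    """Return the <p> tags inside the chosen <div class="x-text">."""
    soup = BeautifulSoup(html, "html.parser")
    # A page may contain several divs with this class.
    divs = soup.find_all("div", class_="x-text")
    if index >= len(divs):
        return []  # nothing matched; the page layout may have changed
    return divs[index].find_all("p")
```

Returning an empty list when the div is missing lets the caller notice a layout change instead of failing with an AttributeError on None.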

Knowing that the list is split into tens across four <p> tags, I inspected the text of one of the <p> tags to figure out how to turn each <p> tag into a Python list of strings, one string per song. My inspection revealed that the songs are separated by a newline (‘\n’).

Then I created a function that returns a cleaned list of the ten songs kept in each <p> tag.

Finally, under this task, I created a function that gets the Top 40 songs list and stores it in a global list called top40list.
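The two functions described above can be sketched as follows. Only the name top40list comes from the article; the function names are hypothetical, and the cleaning shown here (splitting on the newline and trimming whitespace) is an assumption about what “cleaned” means, since the exact format of each line is not shown.

```python
top40list = []  # global list, named as in the article


def clean_song_block(text):
    """Split one <p> tag's text into a list of non-empty song lines."""
    return [line.strip() for line in text.split("\n") if line.strip()]


def get_top40(paragraph_texts):
    """Fill top40list from the text of each <p> tag and return it."""
    top40list.clear()  # start fresh so repeated calls do not duplicate
    for text in paragraph_texts:
        top40list.extend(clean_song_block(text))
    return top40list
```

With BeautifulSoup, the input would be something like [p.get_text() for p in paragraphs], where paragraphs are the four <p> tags found earlier.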

On to the second and final task of the bot.

Getting the lyrics of each song on the Top 40 list and saving them

First, I made the folder/directory where all the lyrics would be stored, using Python’s os.makedirs() function. Click here for a detailed explanation of how the function works.
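A small sketch of the folder step. The folder name “Top 40 Lyrics” and the function name are made up for illustration, and the Desktop location is assumed to be ~/Desktop, which is not true on every system.

```python
import os
from pathlib import Path


def make_lyrics_folder(base=None, name="Top 40 Lyrics"):
    """Create the lyrics folder (if needed) and return its path."""
    # Assumption: the Desktop lives at ~/Desktop unless a base is given.
    base = Path(base) if base is not None else Path.home() / "Desktop"
    folder = base / name
    # exist_ok=True keeps a second run from raising FileExistsError.
    os.makedirs(folder, exist_ok=True)
    return folder
```

Passing exist_ok=True matters for a bot that is run every week: without it, os.makedirs() raises an error as soon as the folder already exists.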

Secondly, I instantiated a WebDriver object named driver that starts up a Firefox browser (my browser). Note that there’s a setup process for getting this to work. To learn the process, click here or here.

Next, I inspected https://azlyrics.com to find out what my code needed in order to search for each song and download the lyrics. The inspection revealed that the opening page has a search box with the name attribute “q”, and that each search result link sits in a table data tag (<td>) that belongs to the class “visitedlyr”.

Exhibit A:

Exhibit B:

Exhibit C:

Exhibit D:

Finally, with the knowledge acquired, I wrote a for loop that downloads the lyrics of each song on the Top 40 list.

Conclusion

Finally, here’s an exhibit of the robot in action:

Robot in action

Summary

I built a web scraping bot using Python that would:

  1. Go to https://top40weekly.com and get the top 40 songs.
  2. Then go to https://azlyrics.com and get the lyrics.
  3. Then finally save the lyrics of each song as a text file in a folder on my laptop

Future Work and Limitations

  1. To get the lyrics, the robot clicks on the first link that appears after it searches for the song. Sometimes the searched song doesn’t appear first, e.g. searching for “STAY” by The Kid Laroi and Justin Bieber returns “STAY WITH ME” by Sam Smith, so the robot saves the lyrics of “STAY WITH ME”. This can be fixed with some manipulation of the code. You’re free to fix it. My code is open-source on GitHub.
  2. The robot depends on the current page structure of https://top40weekly.com and https://azlyrics.com, so it will keep working only until either site is modified.
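One way to approach the first limitation is to score every search result against the wanted title instead of always taking the first link. The sketch below assumes the result titles have already been read into a list of strings. The function names are hypothetical, and difflib's similarity ratio is just one possible scoring choice, not something taken from the repository.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def best_match_index(wanted, candidates):
    """Return the index of the candidate most similar to wanted."""
    wanted_norm = normalize(wanted)
    best_index, best_score = None, 0.0
    for index, candidate in enumerate(candidates):
        score = SequenceMatcher(None, wanted_norm, normalize(candidate)).ratio()
        if score > best_score:
            best_index, best_score = index, score
    return best_index  # None when there are no usable candidates
```

For the example above, best_match_index("STAY", ["STAY WITH ME", "Stay"]) picks index 1, the exact title, rather than the first result.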

References

[1] https://github.com/kingtroga/top_40_songs_scraper_bot

Tari Yekorogha

A 19-year-old Christian boy with a laptop and a dream to break into the tech world.