Battle of the Bots: Testing AI Song Recommendations

Writer: Kacie Julie Coahran Scott

Updated: Feb 14

Grasping the Concepts

Artificial Intelligence (AI) seems to be everywhere these days: integrated into the everyday apps you use, transforming the workplace, and revolutionizing technology. A simple way to understand AI is that it simulates human intelligence in machines, allowing them to perform tasks like problem-solving, learning, and decision-making. What makes that possible? Machine Learning (ML), a subset of AI focused on developing algorithms that allow these systems to learn from data and improve over time. The exciting aspect is that ML helps these AI tools make predictions, recognize patterns, and generate responses based on what they have been fed. The key is to feed them a large volume of relevant, high-quality data.


Today, I’ll put different AI chatbots to the test by prompting each of them with the same question. I'm eager to see how they will respond, what insights they might provide, how they analyze genre, and the level of personalization each offers. This comparison will be a fun discovery of AI recommendation capabilities. Stay tuned for the results!



What Tools Were Used and Why?

  • ChatGPT - The golden child of AI, boasting over 300 million weekly active users. But does popularity equal perfection? Let’s find out!

  • Google Gemini - As a self-proclaimed Google enthusiast, I had to include this one. It’s a great storyteller, but does it have the right spirit?

  • Claude - Developed by Anthropic, it is known for its precise instruction-following skills. Time to put that reputation to the test: will it obey or go rogue?

  • DeepSeek - The AI world is buzzing about this newcomer. Let’s see how it handles a simple request. Is the hype real, or just smoke and mirrors?


What is the Purpose of Comparing these Tools?

Conversational AI tools (aka chatbots) have become increasingly popular by combining AI and ML to engage users in meaningful dialogue. Comparing these chatbots helps in three areas:

  1. Identify strengths and weaknesses in how each responds.

  2. Provide insight into how different tools interpret and generate text.

  3. Help users select the most suitable AI tool for their needs, whether for creative brainstorming, business assistance, or pure entertainment.


Let's Get Started: The Prompt Used

The song 'Take me Back to London' by Ed Sheeran & Stormzy is a favorite of mine. This song is catchy, I like that the voices have an accent, and it's a good mix of pop and grime music. I also enjoy the poetry rap style with singing. Can you please give me 3 song recommendations that have a similar feel?


Let's Look at the Responses and Compare Them

  • ChatGPT Tone? Felt informal, with a neutral warmth.

  • Text? Organized in a bullet-point format; easy to read, simple, and straightforward.

  • Did popularity equal perfection (Previous Question)? It aligned strongly with the original genre theme, emphasizing grime, afrobeat, and a mix of rapping and singing. For me, ChatGPT did the best of all the chatbots: it recommended unique options from new artists, and only one recommendation featured an artist from the original prompt. For future song requests, I'll be using this conversational AI tool to steer me in the right direction!


  • Gemini Tone? Felt engaging and very warm.

  • Text? Written in an enthusiastic, conversational manner, with detailed song descriptions aimed at connecting with me emotionally.

  • Did it have the right spirit (Previous Question)? Yes, it was very high-spirited and had strong recommendations that fit the feel-good vibe of the original song. However, the third recommendation either could not be found or was based on incorrect information. Gemini would have needed additional guidance to find a replacement song recommendation.


  • Claude Tone? Felt friendly, with a thoughtful warmth.

  • Text? Organized as a numbered list, with longer explanations that referenced my preferences and added context for each song recommendation.

  • Did it obey or go rogue (Previous Question)? It did obey; however, I feel it would definitely need additional guidance to find better recommendations. Claude focused on the same artists named in the prompt and also gave incorrect information for the third recommendation.


  • DeepSeek Tone? Had a moderately warm, welcoming tone.

  • Text? Organized as a numbered list, with conversational details and concise descriptions.

  • Is the hype real, or just smoke and mirrors (Previous Question)? The choices were energetic and had a British flair; however, of all the AI chatbots, I feel DeepSeek did the worst. It seemed to focus on the same artists named in the prompt, and the third option had an incorrect song title as well as the wrong genre. The first recommendation was also the wrong genre. DeepSeek would have needed additional guidance to find more unique song options.


*Compiled Every Recommended Song - Giving Additional Insights!

Key Takeaways:

✅ Most recommendations fit grime/UK rap genres.

  • Percentage of recommendations matching the genre? 83.33%


❌ AI over-relied on Stormzy & Ed Sheeran collaborations.

  • "Own It" – Stormzy ft. Ed Sheeran & Burna Boy - 3 out of 4 of the AI tools recommended this specific song.

  • Percentage of songs chosen with Ed Sheeran or Stormzy overall?


❌ Some recommendations had data errors or were difficult to verify.

  • "Don't Jealous Me" by NSG featuring J Hus - could not find this song on YouTube or Tidal under the artist profiles. A fan posted a version but the song featured Sneakbo.

  • "Feels" by Craig David ft. Kaytranada - could not find this song on YouTube or Tidal; however, I did find a song by both artists called 'Got It Good'.

  • "Crown" by Dave featuring Stormzy - This song pulls up as 'Crown' by Stormzy, not the artist Santan Dave.

  • Percentage of incorrect information relayed from the prompt? 33.33% Error Rate.


❌ A few tracks were genre mismatches.

  • "Shape of You" by Ed Sheeran - is only a pop song, lacking a grime/rap feel.

  • "Feels" by Craig David ft. Kaytranada - is alternative indie or electronic, lacking a grime/rap feel.

  • Percentage of mismatching the genre? 16.67%
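
For anyone who wants to check the math behind these percentages, here is a minimal sketch. It assumes each of the 4 chatbots returned exactly 3 recommendations (12 in total), and the counts of 10, 2, and 4 are inferred from the percentages above rather than taken from a published tally. The `share` helper is a made-up name for illustration.

```python
def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 2)


# Assumption: 4 chatbots x 3 recommendations each = 12 recommendations.
TOTAL = 4 * 3

# Assumed counts, inferred from the percentages in the takeaways.
genre_matches = 10
genre_mismatches = 2
data_errors = 4

print(share(genre_matches, TOTAL))     # 83.33
print(share(genre_mismatches, TOTAL))  # 16.67
print(share(data_errors, TOTAL))       # 33.33
```

Under those assumptions, the genre match and mismatch rates add up to 100%, which is a quick sanity check that every recommendation was counted once.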


Possible Enhancements for Artificial Intelligence? 💡

These AI chatbots had an over-reliance on the popular artists used in the prompt.

--> Expanding the artist variety would give the user a chance to find a new singer!

Unfortunately the AI chatbots have some issues with incorrect song data.

--> AI chatbots could be taught to verify the information against different sources like YouTube, Google Music, Apple Music, or Spotify. It can be annoying for a user to get excited about a new song recommendation only to find it doesn't exist!
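
To picture what such a verification step might look like, here is a rough sketch. It does not call any real streaming service: the `CATALOG` dictionary is a hypothetical stand-in that holds only what my manual checks turned up, and the `normalize` and `verify` names are made up for illustration. A real version would need to query an actual music catalog.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so small formatting differences don't matter."""
    return " ".join(text.lower().split())


def verify(title: str, artist: str, catalog: dict[str, set[str]]) -> str:
    """Classify a recommendation as 'verified', 'artist mismatch', or 'not found'."""
    credited = catalog.get(normalize(title))
    if credited is None:
        return "not found"
    if normalize(artist) in credited:
        return "verified"
    return "artist mismatch"


# Hypothetical catalog: song title -> credited artists, based only on the manual checks.
CATALOG = {
    normalize("Crown"): {normalize("Stormzy")},
    normalize("Got It Good"): {normalize("Kaytranada"), normalize("Craig David")},
}

print(verify("Crown", "Dave", CATALOG))         # artist mismatch
print(verify("Feels", "Craig David", CATALOG))  # not found
print(verify("Crown", "Stormzy", CATALOG))      # verified
```

A chatbot that ran a check like this before answering could flag or swap out a shaky recommendation instead of presenting it as fact.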

On the flip side, spotting these data errors helps the user choose the best chatbot for their needs, giving the user confidence in the AI tool's accuracy.

--> Personally, in the future I'll recommend ChatGPT for song requests and recommendations that steer you to new content!

DeepSeek had some genre filtering issues.

--> Fine-tuning the AI's understanding of subgenres could really help the user find songs with a similar feel.

On the flip side, the other AI chatbots had strong genre alignment.

--> As music changes and artists expand beyond their usual genres, continuing to feed these tools up-to-date genre data will help keep their genre-based filtering effective.




 
 
 
