How UIPath helped me migrate my church's sermons to Spotify
Part 1 - What to do when there's no API
I had no idea what I was in for!
A friend and former member of the church I attend was leaving and needed someone to take over management of the church’s website. It’s a small, rural church and I can count the number of tech folks on one hand with fingers to spare, so I was an obvious candidate. Because I’ve never done much website development and have lately been wanting to learn more about it, I jumped at the chance.
My friend and I met and talked through the details. He felt that the site needed to migrate off its current provider/platform to something new. And in the process, a couple of key areas of the site needed a bit of an overhaul to improve the experience for the church members responsible for making updates.
The site is quite static with only two commonly updated parts - the member directory and the sermon recordings. The member directory is quite painful to maintain because of the way it’s implemented, but it only gets updated once every 3-6 months. I’ve created a new directory page and maintenance approach that’s easier to use and manage, but I’m still not content with it. That project will probably turn into another post in the future. The sermon recording page wasn’t too hard to update, but moving it to a new home was more challenging, interesting and educational than I had expected.
A new host for the recordings
During the process of evaluating new website platforms, I learned that most of them don’t offer much storage space with their free or affordable pricing plans. That meant I needed to find an alternative host for the 15GB and growing set of sermon recordings.
I also decided that I wanted more than just links on a website for folks to listen to the sermons. I wanted a podcast and all the benefits it provides. And I wanted to find a podcast site that is easy enough for the non-tech church members to use for the weekly task of uploading a new sermon.
After quite a bit of research on podcast hosts, I settled on Spotify as the best match for my requirements. It offers free podcast hosting, though I have yet to upload the full back catalog of sermons and verify this, and I have no doubt others in the church will have no problem uploading sermons. Plus, our new website platform has a plugin that will nicely embed a Spotify podcast into the site.
At this point I manually migrated a dozen sermons to Spotify so I could show the site to the elders and get some feedback before automating the rest of the migration.
Extracting and cleaning the data
After reviewing the site with the elders and getting the go-ahead to complete the migration, I got down to the business of migrating the sermon recordings.
Since this is a one-time process, I took a very pragmatic approach of automating the tedious and repetitive parts, like downloading 300 recordings from the previous host, and gluing them together with bits of manual effort.
The previous host let me “export the site” which gave me a very large xml file containing the contents of all the pages, but that export did not include the 15GB of MP3 files for the sermons. Thankfully, the xml file did contain information about the MP3 files so it was fairly easy to write a Python script to parse the file and create a csv file with the 3 relevant pieces of information I could find: title, URL and date, which came from the file name of the MP3.
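To give a feel for what that parsing step involves, here is a minimal sketch. The layout of the real export isn't shown in this post, so the sample XML below is hypothetical: it assumes RSS-style `item` elements with a `title` and an `enclosure` whose `url` points at the MP3, and a file name that starts with a `YYYY-MM-DD` date. The function names are made up for illustration.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical sample; the real export's element names and layout may differ.
SAMPLE_XML = """<rss><channel>
  <item>
    <title>Pastor Smith - Faith</title>
    <enclosure url="https://example.com/audio/2019-03-17-faith.mp3" />
  </item>
  <item>
    <title>About Us</title>
  </item>
</channel></rss>"""

# Assumed file-name convention: a YYYY-MM-DD date somewhere in the name.
DATE_IN_NAME = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def extract_recordings(xml_text):
    """Return a list of dicts with title, url and date for each MP3 found."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # a page without a recording
        url = enclosure.get("url", "")
        if not url.lower().endswith(".mp3"):
            continue
        file_name = url.rsplit("/", 1)[-1]
        match = DATE_IN_NAME.search(file_name)
        date = "-".join(match.groups()) if match else ""
        title = (item.findtext("title") or "").strip()
        rows.append({"title": title, "url": url, "date": date})
    return rows


def write_csv(rows, out_file):
    """Write the rows to an open text file as csv with a header line."""
    writer = csv.DictWriter(out_file, fieldnames=["title", "url", "date"])
    writer.writeheader()
    writer.writerows(rows)


rows = extract_recordings(SAMPLE_XML)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Pages without an MP3 are simply skipped, and a recording whose file name has no recognizable date gets an empty date that can be filled in by hand later.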
I opened the csv in Excel and realized that the titles were anything but consistent. Having been uploaded by multiple people over the course of several years, they came in a variety of formats: some included the date and others didn’t, and nearly all mentioned the preacher, but the preachers’ names were spelled in a variety of ways. Anyway, after a couple of hours in Excel extracting information out of the title and then putting it all back together again in a new, consistent title field, I was ready to start uploading the recordings.
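The same kind of cleanup could be scripted rather than done by hand in Excel. The sketch below is only an illustration, not what was actually used: the alias table, the separators it splits on, and the final "date - preacher - topic" format are all assumptions.

```python
import re

# Hypothetical spelling variants mapped to one canonical preacher name.
PREACHER_ALIASES = {
    "pastor smith": "John Smith",
    "j. smith": "John Smith",
    "john smith": "John Smith",
}

# Dates typed into titles, e.g. 3/17/2019 or 03-17-19 (assumed formats).
DATE_PATTERN = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")


def clean_title(raw_title, date):
    """Rebuild a title as 'date - preacher - topic' (an assumed target format)."""
    # Drop any date embedded in the title; the csv's date column is used instead.
    text = DATE_PATTERN.sub("", raw_title)
    # Split on " - " or "|" separators and discard empty pieces.
    parts = [p.strip() for p in re.split(r"\s+-\s+|\s*\|\s*", text) if p.strip()]
    preacher = ""
    topic_parts = []
    for part in parts:
        canonical = PREACHER_ALIASES.get(part.lower())
        if canonical and not preacher:
            preacher = canonical
        else:
            topic_parts.append(part)
    pieces = [date, preacher, " ".join(topic_parts)]
    return " - ".join(p for p in pieces if p)


print(clean_title("Pastor Smith - Faith 3/17/2019", "2019-03-17"))
print(clean_title("Hope | J. Smith", "2018-01-07"))
```

Titles that don't match any alias keep all their pieces as the topic, so nothing is lost and they are easy to spot for manual review.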
After that, I wrote a Python script to read the csv file and download each of the mp3 files to a folder on my computer. Now, let’s start uploading to Spotify so I can wrap this up!
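A download script along these lines can be quite short. This is a sketch under a few assumptions: the csv has a `url` column, each file keeps the name from the end of its URL, and the function names are invented for this example. The `fetch` parameter defaults to the standard library's `urllib.request.urlretrieve` and exists only so the download step can be swapped out when testing.

```python
import csv
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(url, dest_dir):
    """Build the local file path for a recording from the last part of its URL."""
    file_name = os.path.basename(urlparse(url).path)
    return os.path.join(dest_dir, file_name)


def download_all(csv_path, dest_dir, fetch=urllib.request.urlretrieve):
    """Download every URL listed in the csv, skipping files already on disk."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            target = local_path_for(row["url"], dest_dir)
            if not os.path.exists(target):
                fetch(row["url"], target)
            saved.append(target)
    return saved


# Example call (file and folder names are placeholders):
# download_all("recordings.csv", "sermons")
```

Skipping files that already exist means the script can be re-run after an interruption without starting a 15GB download from scratch.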
What? Spotify has no API for podcasters?!?!
I’ll freely admit it’s my fault for not checking more thoroughly, but did you know Spotify does not have an API for podcasters to upload their recordings and manage their podcasts?! Doesn’t every big platform like Spotify have APIs for the most common interactions?
I found Spotify’s API page and pored over it looking for a way to upload to a podcast, but I found nothing. Then I searched the web for longer than I would like to admit, to see if someone else had documented an approach I could build on - strike two. At this point I decided to have another look at podcast hosts, but either the cost was too high or the amount of data I could upload each month was way too low and it would take me months and months to upload the back catalog of recordings.
Deciding again that Spotify was the best option, I figured I could use a library like Selenium to automate clicking and entering data in the Spotify website. I had only briefly used Selenium before and completely underestimated the effort required to find the elements I needed to interact with in the rats’ nest of html. I spent a couple of hours before deciding I needed a different tool.
UIPath to the rescue. In hindsight, perhaps I could have used Selenium IDE, but I didn’t realize it existed until after I had used UIPath, and the website makes me think Selenium IDE is more focused on automating tests than on solving my data entry problem.
UIPath to my rescue
Like many tools I end up using these days, I had never used UIPath before. I would call UIPath a low-code tool for automating GUIs, but that simple explanation belies UIPath’s advanced capabilities, which I feel I have barely scratched the surface of. But I found I was able to do something basic like click a button on a web page without watching a video or reading instructions, so it seemed a good tool to start with. Plus, it has a free plan that worked well for my project, so it cost me nothing.
I did end up having to read a couple of docs and watch a couple of videos as I got into building my Sequence (think low-code script or flow) in UIPath, but when I look at all the steps in my sequence now, I’m shocked at how much I was able to accomplish so quickly and intuitively.
Here’s a pseudocode description of the steps I’m doing in UIPath, and there is a very long screenshot that follows.
Read the recordings csv file I produced earlier, containing the Title, Date and local path to the MP3 file for each recording
Open the browser to the Spotify for Podcasters site (no need to authenticate as my login is cached), then for each recording do the following
Click the “New Episode” button
Fill in the local file path in the File Open dialog so the MP3 can be uploaded - very easy!
While the file is uploading
Fill in the “Title” - also super easy!
Fill in the Publish date
[Are you kidding me?! There is no way to just type the date?!]
Ok, create a loop to click the left arrow until the year in the UI is not larger than the year of the recording’s date
Now that I have the right year, choose the correct month with a case statement and a different element to click based on the recording’s month (i.e. click “Jan” for January). 12 different places to click isn’t that bad, but at this point I’m getting concerned about the time, effort and potential for error if I have to do the same to select the day of the month…
Time to set up a case statement for the different days of the month… but wait, days of the month don’t stay in the same place on the calendar month after month. What kind of logic is it going to require to point to the correct location? Some kind of pattern that increments every 7 years? What about leap years? I really want to move on to the next part of this project… am I still going to be working on it during the next leap year?
No! Thankfully it was shockingly easy in UIPath’s UI targeting tool to find the element that had an InnerText matching the Day by replacing “29” in the screenshot below with my variable name “dateDay”
so now it looks like this in the Properties tab:
At this point, all I had to do to wrap up was confirm the date, wait for the MP3 file to finish uploading, which caused a button called “Schedule episode” to become active, then have UIPath click that button.
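The date-picker steps above boil down to a little date arithmetic. The sketch below expresses that logic in Python purely as an illustration; it is not UIPath code, and the function names are invented. It also shows the row-and-column calculation that clicking by position would have required, which matching on InnerText made unnecessary. The Sunday-first week is an assumption about how the calendar is laid out.

```python
import calendar
from datetime import date

# Labels assumed to match the month buttons in the date picker.
MONTH_LABELS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def year_clicks(displayed_year, target):
    """Left-arrow clicks needed to step back from the displayed year to the target year."""
    return max(0, displayed_year - target.year)


def month_label(target):
    """Text of the month button to click, e.g. 'Jan' for January."""
    return MONTH_LABELS[target.month - 1]


def day_grid_position(target, first_weekday=calendar.SUNDAY):
    """Row and column of the day in a month grid, as position-based clicking would need."""
    cal = calendar.Calendar(firstweekday=first_weekday)
    weeks = cal.monthdayscalendar(target.year, target.month)
    for row, week in enumerate(weeks):
        if target.day in week:
            return row, week.index(target.day)
    raise ValueError("day not found in month")


recording_date = date(2020, 2, 29)
print(year_clicks(2024, recording_date))
print(month_label(recording_date))
print(day_grid_position(recording_date))
```

The grid position changes from month to month, including in leap years, which is exactly why targeting the element by its text is so much simpler than targeting it by location.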
There is a whole lot more to learn about UIPath, but with a very reasonable amount of effort I was able to accomplish a task that would have taken me much, much longer if I had stuck with Selenium. I’m looking forward to using it again, and I’m thrilled to have a great new tool in my toolbox!
But what about that empty field?
Perhaps you noticed in the “Episode options” dialog above that there was one field I didn’t mention - Episode description:
Me: Wouldn’t it be nice to provide a brief description of each episode, so a listener gets more information than just what’s in the title?
Me: Definitely! I listen to a lot of podcasts and the descriptions let me search for topics I’m particularly interested in.
Me: But the original site didn’t contain a description so where am I supposed to get it? I don’t have time to listen to nearly 300 episodes and create my own summary.
Me: Hmmm… good question… what about the speakers? Could they provide the summaries?
Me: I’m sure they don’t have time, and some of those sermon recordings go back 6 years, so they would likely need to re-listen to their sermon to provide a summary.
Me: Ewww! Who likes listening to themself speak?
Me: Not me! I’d rather listen to a teacher’s nails on the chalkboard!
Me: Can’t AI create summaries?
Me: I think so…
Stay tuned
Ok - I had originally planned to include everything in one post, but this one is already quite long. You can read the second post at Affordable Data as soon as it’s published.