I'm part of a small group of Jr Self Taught Web Developers who were recently brainstorming ideas for a Group Project App we could put together and actually create a user base.
I offered up the suggestion of a podcast application which would have the major feature of being akin to YouTube Sponsor Block, but specifically for podcast episodes.
Essentially, a user-contributed database of timestamps for podcast episodes, marking where the show cuts to sponsored ads or mentions sponsorships, so that those segments could be edited out of the episode. The user could then also download the episode with the ads cut out of the final audio file.
My idea was shot down due to fears of possibly infringing on copyright and we ended up going with another idea. I'm certainly not upset, and am actually excited about the project idea we did choose, but it did get me wondering whether this idea actually could have legal implications.
I know specifically with YouTube there appears to be a sort of legal loophole that prevents Google from suing projects like invidious, yt-dlp, and YouTube Sponsor Block, but am unaware of the specific details as to how this works.
Thusly, I just wanted to ask if anyone has any insights into whether this project idea would run into legal trouble with the likes of iHeartRadio and other media platforms?
To be clear, I'm not seeking legal advice here, and I'll be taking any responses with a grain of salt, but I just wanted to see if anyone knows anything on this subject and the legal concerns raised.
I very much dislike being advertised to and podcasts are one of the last bastions of media where advertisements still come up regularly and I'd love to make this application for those who are frustrated with how often they have to skip through sponsor mentions.
Thanks in advance.
It likely won't work (well), because lots of podcasts actually use Megaphone and similar services that add interest-based ads into your download. I.e. ads can be of variable length or there may even be no ads, because the podcast targets the US but you're downloading from Pakistan.
Oh man, that answers some questions I've had for a while. Some of the podcasts I listen to have custom ad reads, and then some will just blast the same ad as another unrelated one. Especially considering I get Republican ads on podcasts with very liberal hosts. Plus gambling ads fucking everywhere.
Not for anything I listen to, they just embed standard product ads in their talking
Have you downloaded your podcasts while in another country or with a VPN set to another country?
AntennaPod doesn't embed ads. Mostly EU or close to EU countries
Not your app, but the server on which the podcast is hosted. They will see from which country you are trying to download it and sometimes insert different ads. But this mostly depends on the podcasts you are listening to.
Interesting, but this may be at defined timestamps, right? So it wouldn't change the core idea.
They may be added at a defined time stamp, but if the ad length varies, then the timing would just be thrown off.
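To put rough numbers on that drift, here is a minimal Python sketch. The ad positions and lengths are made up, and `file_time` is a hypothetical helper, not part of any real podcast tool. It only shows how the same moment of the show lands at different offsets in two downloads.

```python
def file_time(content_time, inserted_ads):
    """Where a moment of the ad-free show ends up in one particular download.

    content_time: seconds into the show with all inserted ads removed.
    inserted_ads: (position, length) pairs in seconds; position is measured
        on the ad-free timeline (hypothetical layout, for illustration only).
    """
    shift = sum(length for position, length in inserted_ads if position <= content_time)
    return content_time + shift


# Two downloads of the same episode with different server-inserted ads.
contributor_ads = [(0.0, 30.0), (900.0, 60.0)]
listener_ads = [(0.0, 15.0), (900.0, 90.0)]

host_read_start = 1200.0  # start of a host-read sponsor segment
print(file_time(host_read_start, contributor_ads))  # 1290.0
print(file_time(host_read_start, listener_ads))     # 1305.0
```

A timestamp contributed from the first download would be 15 seconds off in the second one, even though both ads sit at the same insertion points.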
I know they get pretty local. I listen to a podcast from Canada that inserts ads for concerts in my home city in Ohio.
I wonder if those inserted ads could be detected. AntennaPod also supports downloads, so I wonder how that would work.
Also I wonder if such ads always need to have a given length, but maybe not.
It should be possible to detect non-ads by downloading different versions of the audio file and checking which sections are identical, but you'd need some way of detecting transitions between sections.
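A toy version of that comparison in Python, assuming both downloads are already decoded to sample lists and that the shared sections are sample-identical (real files that were re-encoded per download would likely need a fuzzier comparison). `shared_sections` is a made-up name; `difflib.SequenceMatcher` is from the standard library and would be far too slow for full-length audio, so treat this as a sketch of the idea only.

```python
from difflib import SequenceMatcher


def shared_sections(version_a, version_b, min_len):
    """Return (start_in_a, start_in_b, length) for runs present in both versions.

    Runs shorter than min_len are ignored so that chance matches inside
    the ads are not mistaken for show content.
    """
    matcher = SequenceMatcher(None, version_a, version_b, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


# Fake "audio": 60 samples of show content with a different ad spliced in
# after the first 20 samples of each download.
show = list(range(100, 160))
download_us = show[:20] + [-1] * 7 + show[20:]
download_pk = show[:20] + [-2] * 3 + show[20:]

print(shared_sections(download_us, download_pk, min_len=10))
# [(0, 0, 20), (27, 23, 40)]
```

The gaps between the matched runs are the candidate ad slots, and the run boundaries are the transitions.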
If the ads use a voice actor who doesn't talk on the podcast, maybe you could try to detect that.
This gets to the heart of the difficulty of this proposed project though. Thanks for going down this train of thought to all involved, very interesting. I had hoped to utilize a series of user contributed timestamps, but this would get more complicated depending on region, distributor, etc. This is a project I'll be thinking about long term though (and if I really think I have a solid plan, I'll seek legal advice last to ensure I have all my ducks in a row). Thanks for the advice.
This makes sense.
So mark the segment where an ad starts, then the segment when the show goes on. The client catches the audio snippet which can then be moved to autoskip ads.
I didn't really know. However they will probably have different lengths, so this might be a problem
I see. I think there might be an issue in redistribution to a certain extent. Some podcasts you can download directly from their website using RSS feeds and command line tools like wget. But a lot of those don't directly have sponsor mentions, but if they do, those are easily removed because they aren't injected at download time.
Others would require downloading through a service like Spotify, then editing the audio file, and then redistributing it from a centralized data store, and that's where I believe the legal question would certainly gain more validity.
That's as opposed to just providing the timestamps and running a script that removes those clips prior to download from another source (like how the SponsorBlock API can be queried to cut out sponsor mentions using a command line flag from yt-dlp prior to download), which I believe would fall into more of a legal grey area.
But yeah, injection of ads based off of location is one potential hiccup I had considered when thinking about the proposed app's implementation. Even if the ads are always loaded at a specific timestamp in the episode, their length would vary, making it less likely to work consistently, as you indicated.
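For the "timestamps plus a local script" approach, the core step is turning a list of crowdsourced skip segments into the ranges of audio to keep. A minimal sketch in Python follows. `keep_ranges` is a hypothetical helper, and the actual cutting would be handed to whatever audio tool the client uses.

```python
def keep_ranges(duration, skip_segments):
    """Return the (start, end) ranges that remain after removing skip_segments.

    duration: total length of the downloaded file in seconds.
    skip_segments: (start, end) pairs in seconds; may overlap or be unsorted.
    """
    kept = []
    cursor = 0.0
    for start, end in sorted(skip_segments):
        # Clamp each segment to the file so bad submissions cannot break the output.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if end <= start:
            continue
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept


print(keep_ranges(3600.0, [(600.0, 660.0), (0.0, 30.0), (650.0, 700.0)]))
# [(30.0, 600.0), (700.0, 3600.0)]
```

Only the timestamps travel between users here. The audio itself is cut on the listener's own machine.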
So the only way would be to keep the audio files with the sponsor mentions removed in a centralized data store to be redistributed from, which I'm pretty sure isn't legal...not sure though.
Thanks for the insights!
Even if you're downloading the file directly from the URL found in the RSS feed, that doesn't mean that ads can't be dynamically injected into the file. A URL like https://download.my.podcast/episode4.mp3 can still be answered by a script that serves a custom version of the podcast with region specific ads.
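As a rough illustration of what such a script might do (the ad inventory, region codes, and `build_episode` are all invented here, and real ad-insertion services are certainly more involved), the server can stitch the file together per request:

```python
# Hypothetical per-region ad inventory; the byte strings stand in for audio data.
AD_INVENTORY = {
    "US": b"<thirty-second-us-ad>",
    "DE": b"<de-ad>",
}


def build_episode(intro, body, region):
    """Assemble the bytes served for one request to the same episode URL."""
    ad = AD_INVENTORY.get(region, b"")  # regions without inventory get no ad
    return intro + ad + body


intro, body = b"<intro>", b"<rest-of-show>"
for region in ("US", "DE", "PK"):
    print(region, len(build_episode(intro, body, region)))
```

Same URL, three different file lengths, which is why a timestamp taken from one download may not line up with another.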
You're right, I had forgotten about targeted ads, and those change the length of the ad dynamically.
Ftr: I was talking about regular RSS feeds+MP3 downloads, not Spotify exclusives.
If you really wanted to do something about Spotify exclusives, likely the only way to do this legally is building a custom Spotify client. Spotify allows custom clients, but only for paying customers, not for free users.
You definitely would have legal issues redistributing the ad-free version.
Sponsor block works partly because it simply automates something the user is already allowed to do - it's legally very safe. No modification or distribution of the source file is necessary, only some metadata.
It's an approach that works against the one-off sponsorships read by the actual performers, but isn't effective against ads dynamically inserted by the download server.
One option could be to crowdsource a database of signatures of audio ads, Shazam style. This could then be used by software controlled by the user (cf. the SB browser extension) to detect the ads and skip them, or to cut the ads out of files the user had legitimately downloaded, regardless of which podcast or where the ads appear.
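A very simplified sketch of the lookup side in Python. This is an assumption-heavy toy: it hashes the exact opening samples of each known ad, whereas a real fingerprinting system would hash features that survive re-encoding. All names here (`WINDOW`, `fingerprint`, `build_index`, `find_ads`) are made up.

```python
import hashlib

WINDOW = 4  # samples per fingerprint; tiny on purpose, for illustration


def fingerprint(samples, start):
    """Hash one window of samples (exact match only, not robust to re-encoding)."""
    window = samples[start:start + WINDOW]
    return hashlib.sha1(repr(window).encode()).hexdigest()


def build_index(known_ads):
    """Map the fingerprint of each ad's opening window to (name, length)."""
    return {fingerprint(s, 0): (name, len(s)) for name, s in known_ads.items()}


def find_ads(episode, index):
    """Return (name, start, end) for every known ad found in the episode."""
    hits = []
    pos = 0
    while pos + WINDOW <= len(episode):
        match = index.get(fingerprint(episode, pos))
        if match:
            name, length = match
            hits.append((name, pos, pos + length))
            pos += length  # jump past the detected ad
        else:
            pos += 1
    return hits


known_ads = {"casino": [9, 8, 7, 6, 5, 4], "vpn": [1, 1, 2, 3, 5]}
episode = [10, 11, 12] + known_ads["casino"] + [13, 14] + known_ads["vpn"] + [15]
print(find_ads(episode, build_index(known_ads)))
# [('casino', 3, 9), ('vpn', 11, 16)]
```

Because the ads are found by content rather than by position, it doesn't matter where the server inserted them or how long the neighbouring ads were.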
Sponsorships by the actual content producers could then be handled in the same way as SB: check the podcast ID and total track length is right (to ensure no ads were missed) then flag and skip certain timestamps.
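That check could look something like the following Python sketch. The record layout, the field names, and the half-second tolerance are assumptions for illustration, not how SB actually stores things.

```python
from dataclasses import dataclass


@dataclass
class SegmentSet:
    """One crowdsourced entry (hypothetical layout)."""
    episode_id: str
    expected_duration: float  # seconds, with all inserted ads already removed
    segments: list            # (start, end) pairs of host-read sponsorships


def segments_to_skip(entry, episode_id, actual_duration, tolerance=0.5):
    """Return the segments to skip, or nothing if the file doesn't match the entry."""
    if entry.episode_id != episode_id:
        return []
    # A length mismatch suggests an inserted ad was missed, so the
    # timestamps can't be trusted for this file.
    if abs(entry.expected_duration - actual_duration) > tolerance:
        return []
    return list(entry.segments)


entry = SegmentSet("show-ep4", 3540.0, [(600.0, 660.0)])
print(segments_to_skip(entry, "show-ep4", 3540.2))  # [(600.0, 660.0)]
print(segments_to_skip(entry, "show-ep4", 3570.0))  # []
```

Returning nothing on a mismatch means the worst case is hearing an ad, rather than skipping part of the show.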
That is one of the more unique ideas presented thus far. The other similar approach would be utilizing a trained AI model that would recognize advertisements and sponsor mentions. I'm not exactly sure how Shazam works, but that might be something to research in figuring out how best to approach this. Thanks.
Yeah, I have no idea either, but it's been around for more than a decade so it should be fairly easy to find a library that duplicates it.
I would be wary of AI-based solutions. There's a risk of it picking up e.g. satirical/spoof sponsorships as actual ads, and perhaps not detecting unusual ads.
I'm slightly terrified of the day someone starts getting AI to reword and read out individual ads for each stream.
Perhaps that would be a good first step then. Figure out how Shazam works, then create a standalone application that catalogues and recognizes the audio of advertisements. An obvious name for such an app would be along the lines of "IsAnAd?". Then hook that standalone application up to a podcast aggregation client and use the timestamps of that to create the desired sponsor block functionality.
Thanks again. Just hashing this out with others like yourself has been super helpful.