You can get pretty far with a bit of JS and Tampermonkey. You can also search existing user scripts in case someone has already done it.
Not an answer, but you don't need an extension to defeat right-click blocking scripts: shift-right-click usually does the trick.
Puppeteer and Playwright haven't been mentioned yet.
You didn't describe how the images are embedded on the website.
I would use the browser's (e.g. Firefox's) save page functionality.
Or open the browser dev tools, run document.querySelectorAll('img'),
get the URLs from the result, and use those.
Or the Page Info media tab.
Or the dev tools network tab, to identify and reuse the image requests.
Or use Nushell with the query module enabled: http get, then query the HTML.
Or my own C# util.
But I suspect there's auth in play, so the only easy access is within the browser session?
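To make the dev tools suggestion above concrete, here is a rough sketch that can be pasted into the browser console. The function name collectImageUrls and its baseUrl parameter are made up for illustration; it only assumes the images are plain img elements with a src, which may not hold for lazy-loaded or canvas-rendered images.

```javascript
// Hypothetical helper: collect the absolute URL of every <img> in a document.
// `doc` is anything with querySelectorAll (in the console, the global `document`).
// `baseUrl` is used to resolve relative paths (in the console, `location.href`).
function collectImageUrls(doc, baseUrl) {
  const urls = new Set(); // a Set drops duplicate URLs
  for (const img of doc.querySelectorAll('img')) {
    // currentSrc is the source the browser picked from srcset; fall back to src.
    const raw = img.currentSrc || img.src;
    if (!raw) continue; // skip images without a source
    try {
      urls.add(new URL(raw, baseUrl).href);
    } catch {
      // ignore values that cannot be parsed as a URL
    }
  }
  return [...urls];
}

// Browser console usage:
//   console.log(collectImageUrls(document, location.href).join('\n'));
```

Because this runs inside the logged-in browser session, it sidesteps the auth question for listing the URLs, though downloading them from outside the browser may still need the session cookies.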
Have a look at Robot Framework with the Selenium library. Anything you can do manually, you can automate repeatably with Robot.
Also, check the F12 Network tab in case the real images are stored under predictable names.
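If the Network tab does show a predictable naming scheme, the URL list can be generated instead of scraped. A minimal sketch, assuming a numbered, zero-padded pattern; the function name buildImageUrls, the {n} placeholder, and the example.com URL are all hypothetical and would need to match whatever the real site uses.

```javascript
// Hypothetical helper: expand a URL template with a numeric placeholder.
// `template` contains the literal text "{n}" where the number goes.
// `width` is the zero-padded width of the number (0 means no padding).
function buildImageUrls(template, first, last, width = 0) {
  const urls = [];
  for (let n = first; n <= last; n++) {
    const number = String(n).padStart(width, '0');
    urls.push(template.replace('{n}', number));
  }
  return urls;
}

// Example with a made-up pattern:
//   buildImageUrls('https://example.com/img/page_{n}.jpg', 1, 3, 3)
```

The resulting list can then be fed to whatever downloader you prefer.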
Before scraping, I would check whether there is an HTTP API you can send requests to instead of scraping the website. The data from an API may be higher quality than what you can scrape. If there is no easy-to-use HTTP API, then fall back to scraping. I generally consider scraping the last option, unless the website is ridiculously easy to scrape.
I'd probably use Selenium. But that depends.