I agree that OP sounds like a beginner, and what you've suggested is likely the best approach for someone who is familiar with frontend tools and frameworks. Selenium (and admittedly BeautifulSoup) is probably too low level for this particular user, but that doesn't mean they can't still learn some fundamentals while solving this problem without resorting to something as heavy and complicated as background browser emulation and rendering. I could be wrong though.
aMockTie
I'm not currently on Discord, could you upload the code to pastebin or something similar?
I would love to see your code, but I understand if this forum isn't the most ideal place to share.
In my experience, this scenario typically means that there is some sort of API (very likely undocumented) that is being used on the backend. That requires a bit more investigation and testing with browser developer tools, the JS Console, and often trial and error. But once you overcome that (admittedly very complex and technical) hurdle, you can almost always get away with just using the requests library at that point.
I've had to do that kind of thing more times than I'd like to admit, but the juice is almost always worth the squeeze.
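For what it's worth, here is a rough sketch of what that usually ends up looking like once you've found the backend call in the Network tab. Everything specific in it is an assumption for illustration: the `page`/`per_page` parameters, the `items` key, and the idea that the endpoint is paginated at all. The function takes any object with a requests-style `.get()` (such as `requests.Session()`), so it can be tried offline with a stub first.

```python
def fetch_all_items(session, url, page_size=50, max_pages=100):
    """Collect items from a paginated JSON endpoint.

    `session` is anything with a requests-style .get(), e.g. requests.Session().
    The `page`/`per_page` parameters and the `items` key are hypothetical;
    copy the real ones from the request shown in the browser's Network tab.
    """
    items = []
    for page in range(1, max_pages + 1):
        resp = session.get(
            url,
            params={"page": page, "per_page": page_size},
            headers={"Accept": "application/json"},
            timeout=10,
        )
        resp.raise_for_status()  # fail loudly on 4xx/5xx responses
        batch = resp.json().get("items", [])
        if not batch:
            break  # an empty page is treated as the end of the data
        items.extend(batch)
    return items
```

With the real library that would be something like `fetch_all_items(requests.Session(), url)`, where `url` is whatever endpoint you found. Real endpoints often also want cookies or extra headers copied from the browser request.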
Selenium is really more of a testing framework for frontend developers, and could theoretically be used for scraping, but that would be somewhat like buying a car based on the paint and not looking in detail under the hood.
I can't say I've ever worked with Scrapy, but the tool I would use for web scraping with Python is BeautifulSoup. This tutorial seems decent enough, but you will need to understand basic web concepts like IDs, classes, tags, and tag attributes to get the most out of it: https://geekpython.medium.com/web-scraping-in-python-using-beautifulsoup-3207c038723b
W3Schools will also be your friend if you have questions about HTML/CSS selectors in general: https://www.w3schools.com/html/default.asp
Understanding regular expressions and/or XPath would also be very helpful, but is probably best considered extra credit in most cases.
I'll try to respond if you have any issues or questions, but hopefully that gives you enough to get started.
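To make the ID/class/attribute and XPath/regex ideas a bit more concrete, here is a small sketch. Note that it uses the standard library's `xml.etree.ElementTree` (which supports a limited XPath subset) rather than BeautifulSoup, purely to keep it dependency-free. It only works on well-formed markup, and the HTML snippet and its `data-sku` attribute are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet; real pages are usually messier.
HTML = """
<div id="products">
  <p class="item" data-sku="A1">Widget <span class="price">$4.50</span></p>
  <p class="item" data-sku="B2">Gadget <span class="price">$12.00</span></p>
  <p class="note">Prices exclude tax</p>
</div>
""".strip()

root = ET.fromstring(HTML)

# Tag + class: every <p> whose class attribute is exactly "item".
items = root.findall(".//p[@class='item']")

# Tag attribute: read data-sku from each matched element.
skus = [p.get("data-sku") for p in items]

# Nested lookup: the <span class="price"> directly inside each item.
price_texts = [p.find("span[@class='price']").text for p in items]

# Regular expression: pull the numeric part out of text like "$4.50".
prices = [float(re.search(r"\d+(?:\.\d+)?", t).group()) for t in price_texts]

print(root.get("id"), skus, prices)
```

The same ideas carry over to BeautifulSoup, which is far more forgiving of the broken HTML you'll find on real sites.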
Been a little while since I worked on ODBC stuff, but I have a couple of thoughts:
- Would it be possible to use something like a table function on the DB side to simplify the query from the ODBC side?
- I could be misremembering, but I feel like looping through individual inserts with an open connection was faster than trying to submit data in bulk when inserting that much data in one shot. Might be worth doing a benchmark in a test DB and table to confirm.
I know I was able to insert more than 50M rows in a matter of single-digit hours, but unfortunately I don't have access to that codebase anymore to double check the specifics.
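If it helps, a benchmark along those lines could be sketched roughly like this. It uses `sqlite3` from the standard library as a stand-in for an ODBC connection, which is an assumption made only so the example runs anywhere; the table `t` and its columns are hypothetical. Timings against your actual driver and database may look completely different, so treat this as a template to adapt rather than a result.

```python
import sqlite3
import time

def make_rows(n):
    # Hypothetical two-column test data.
    return [(i, f"name-{i}") for i in range(n)]

def insert_one_by_one(conn, rows):
    # One execute() per row, committed once at the end.
    cur = conn.cursor()
    for row in rows:
        cur.execute("INSERT INTO t (id, name) VALUES (?, ?)", row)
    conn.commit()

def insert_bulk(conn, rows):
    # A single executemany() call for the whole batch.
    cur = conn.cursor()
    cur.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)
    conn.commit()

def benchmark(insert_fn, rows):
    # Fresh in-memory DB per run so the two approaches don't interfere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    start = time.perf_counter()
    insert_fn(conn, rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count

rows = make_rows(10_000)
for fn in (insert_one_by_one, insert_bulk):
    elapsed, count = benchmark(fn, rows)
    print(f"{fn.__name__}: {count} rows in {elapsed:.4f}s")
```

To run it against the real thing, swap the `sqlite3.connect(...)` call for your ODBC connection and point it at a test DB and table.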
Thank you for the detailed response!
I see now that you were responding to the title and not the content of the post.
To further round out my knowledge and understanding, could you please point me to additional information about Latinos moving further to the right, Harris' popularity, and polling data showing her well behind Trump in key states?
I'm also curious to see examples of the American left refusing to acknowledge the potential of a second Trump term.
Thanks again for helping me understand.
I'm not sure if I understand the points you are trying to make. Could you please elaborate? A) This meme is about records broken, not "wins." A president winning a second term would not break any records. B) Why wouldn't Americans vote for a black woman? C) Why would the thought of Americans electing a black woman be "cult like?"
I'm sorry, I hope I didn't offend you.
I'm still not sure if I understand the intended joke. An average of 41 attempts per person sounds horrific. I'm sure there is something that's going over my head. It's some kind of dark humor, correct?
I sincerely hope that she has been trending in a positive direction. I'm glad to hear that her attempts have been unsuccessful, and that she has good love, support, and (hopefully helpful) medication in her life.
I imagine the knowledge of your sister struggling and suffering is hard on you too, and I wish the best for both of you.
Thank you, I agree that it's an important distinction to make. Having only been able to read the abstract of the linked article, do you perhaps have any information on the number of completed attempts compared to unsuccessful attempts?
I hope it goes without saying that even a single attempt is too many, and any completed attempts are devastating tragedies that don't reflect kindly on our current society.
100% this. Every website is different, though after doing this kind of thing for long enough, there are often common patterns and frameworks/libraries. Even general obfuscation can be reasonably reverse engineered with enough time and effort.