• TWeaK@lemm.ee
    2 years ago

    Yeah, I saw someone talking about the Pushshift method for when you don’t have a GDPR request (which reddit have been stalling on for the last month anyway, so you won’t get it in time). The method was messy: you get a script from one GitHub repo, edit it to scrape out your comments, then edit another script to get it to work with your scraped links. That just seems like way too much hassle.

    Even shreddit has been a pain for me. It panics every so many comments (sometimes after a few thousand, sometimes after just 1), and then I have to find the comment it stopped at, delete all the comments up to that point (backing them up in another text file), then run the script again. I’ve been at it for ages now: I’ve got 96 comment files so far and still have 24,000 lines left (out of 75,000) in the main comment file. But I’m determined to get it done before the deadline.
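
For anyone stuck in the same find-the-failure, back-up, trim, rerun loop, here is a rough sketch of how the file-splitting part could be automated. It is not part of shreddit and doesn't call it. It assumes the main comment file has one comment per line and that you can identify the comment it stopped at by some unique text on that line (an ID or a snippet). The function names, the file names in the usage line, and the `keep_failed` flag are all made up for illustration.

```python
from pathlib import Path


def split_at(lines, marker, keep_failed=True):
    """Split lines at the first line containing marker.

    Returns (done, remaining). With keep_failed=True the matching line
    stays at the top of `remaining` so it gets retried on the next run;
    with keep_failed=False it is moved into `done` and skipped.
    """
    for i, line in enumerate(lines):
        if marker in line:
            cut = i if keep_failed else i + 1
            return lines[:cut], lines[cut:]
    raise ValueError(f"marker {marker!r} not found")


def checkpoint(main_file, backup_file, marker, keep_failed=True):
    """Move already-processed lines from main_file into backup_file."""
    main = Path(main_file)
    # keepends=True preserves the original line endings when rewriting
    lines = main.read_text(encoding="utf-8").splitlines(keepends=True)
    done, remaining = split_at(lines, marker, keep_failed)
    Path(backup_file).write_text("".join(done), encoding="utf-8")
    main.write_text("".join(remaining), encoding="utf-8")
    return len(done), len(remaining)
```

Something like `checkpoint("comments.csv", "comments_097.csv", "abc123")` would then replace the manual cut-and-paste step. Writing each backup to a new numbered file, rather than appending to one, keeps the record of what was processed in each run.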

    Then afterwards, when reddit inevitably restores them, I’ll have a record of everything being deleted, so hopefully I can get the ICO to give them a hefty GDPR fine for retaining and restoring my personal information.