Please remove all my posts from the fora

The Aloof Alot · Aug 25, 2024

GaitherBill said:
“Fora”

Correct plural of the Latin word. :eng101:

...from which the English version is derived. Which has both "fora" and "forums" as acceptable plurals.

von Chaps · Aug 25, 2024

Shavano said:
I am sitting down.

Stand up when you talk to a mod.

Aurich · Aug 25, 2024

von Chaps said:
Stand up when you talk to a mod.

But then sit down to listen. And raise your hand when you want to speak, and wait for the talking stick to be passed to you.

Deleted member 441963 · Aug 25, 2024

Aurich said:
You posted public comments on the internet. They're public now. They're archived all over the place.

Also:

Language models generate text based on statistical probabilities. This led to serious false accusations against a veteran court reporter by Microsoft's Copilot. German journalist Martin Bernklau typed his name and location into Microsoft's Copilot to see how his culture blog articles would be picked up by the chatbot, according to German public broadcaster SWR. The answers shocked Bernklau. Copilot falsely claimed Bernklau had been charged with and convicted of child abuse and exploiting dependents. It also claimed that he had been involved in a dramatic escape from a psychiatric hospital and had exploited grieving women as an unethical mortician.

Copilot even went so far as to claim that it was "unfortunate" that someone with such a criminal past had a family and, according to SWR, provided Bernklau's full address with phone number and route planner. I asked Copilot today who Martin Bernklau from Germany is, and the system answered, based on the SWR report, that "he was involved in a controversy where an AI chat system falsely labeled him as a convicted child molester, an escapee from a psychiatric facility, and a fraudster." Perplexity.ai drafts a similar response based on the SWR article, explicitly naming Microsoft Copilot as the AI system.

Sucks to write shit that ends up in the 'capable' hands of LLM? Should have thought of that before you wrote that?

Come on, Aurich.. How old are you?

I am perfectly OK with my old stuff in archive.org or that last remaining copy of usenet.

I AM NOT OK WITH A LLM HALLUCINATING I RAPED A FIVE YEAR OLD GIRL AND PUTTING THAT ON THE INTERNET.

Remove my shit because by now it's clear that OpenAI, Microsoft, META etc cannot be trusted to keep my shit away from their LLM. It is already happening.

Scotttheking · Aug 25, 2024

burne_ said:
I AM NOT OK WITH A LLM HALLUCINATING I RAPED A FIVE YEAR OLD GIRL AND PUTTING THAT ON THE INTERNET.

Llm isn’t authoritative so why does that matter?

Aurich · Aug 25, 2024

Come on, Aurich.. How old are you?

Old enough to not be impressed with weird internet bluster? How old is that? I feel like these days it's probably 13?

That fits, I'm still 13 at heart in many ways.

Yeah, I'll go with that, I am 13.

I am perfectly OK with my old stuff in archive.org or that last remaining copy of usenet.

Then ... you actually do not in fact care about LLMs ingesting your text since everyone just scrapes that anyways.

Remove my shit because by now it's clear that OpenAI, Microsoft, META etc cannot be trusted to keep my shit away from their LLM. It is already happening.

I will interpret that as "please delete my account" and do so.

People can just ask without yelling or trying to imply I'm a child or whatever else. I'm not going to be bullied, and I do respond to polite requests.

headache · Aug 25, 2024

Scotttheking said:
Llm isn’t authoritative so why does that matter?

If LLMs aren't supposed to be interpreted as authoritative, why are LLM generated responses the first or second thing you see on most web searches? Would you be fine with that being the first thing people see when they search your name on google or bing?

Would it matter if someone took out a full page ad in the Washington Post accusing you of the same? Advertisements aren't authoritative, after all. What about an op ed?

Aurich · Aug 25, 2024

headache said:
If LLMs aren't supposed to be interpreted as authoritative, why are LLM generated responses the first or second thing you see on most web searches?

Because Google is desperately trying to not look irrelevant, while turning their core product into shit?

I think we're past the "Google does it, therefore it's not evil" stage by now, don't you?

headache · Aug 25, 2024

Aurich said:
Because Google is desperately trying to not look irrelevant, while turning their core product into shit?

I think we're past the "Google does it, therefore it's not evil" stage by now, don't you?

I'm not sure you're responding to what I actually wrote?

Galvanic · Aug 25, 2024

Aurich said:
Because Google is desperately trying to not look irrelevant, while turning their core product into shit?

I mean I’m impressed with the chutzpah of someone whose fora are dying a long slow death and are being monetized by ChatGPT nonetheless yelling at Google about turning their product into shit. I said in another thread that you would be the reasonable face for Ars of Cory Doctorow's enshittification process, but the cracks are starting to show.

Felix K · Aug 25, 2024

Aurich said:
I will interpret that as "please delete my account" and do so.

People can just ask without yelling or trying to imply I'm a child or whatever else. I'm not going to be bullied, and I do respond to polite requests.

All I get from your posts is that you're not upset by this and that you don't understand why people are upset. No shade and we are all in the same boat, but have you heard of empathy? Clearly people are upset, clearly people are asking to be heard and clearly people need some reassurance.

Your response is to... what... ask for politeness? What is this, the DMV? If you're not capable of hearing out people's venting, then tell Caesar that you're not fit for the task. This isn't a joke. This move will literally kill Ars... or at least make it unrecognizable to us the people that started it. If that's not your problem, then cool, but make your position clear.

Xenocrates · Aug 25, 2024

headache said:
I'm not sure you're responding to what I actually wrote?

I think that Aurich is saying "LLMs shouldn't be considered authoritative, and just because people are cramming the latest buzzword in for investor attention does not change the underlying reality that LLMs do not understand anything, should not be used as a source or a research tool, and are primarily suitable for marketing puffery and by the numbers fiction, because only those disciplines actually lack consequences for inaccuracy, plagiarism, and bullshit" without saying that as explicitly.

Galvanic said:
I mean I’m impressed with the chutzpah of someone whose fora are dying a long slow death and are being monetized by ChatGPT nonetheless yelling at Google about turning their product into shit. I said in another thread that you would be the reasonable face for Ars of Cory Doctorow's enshittification process, but the cracks are starting to show.

I'm impressed that you keep poking at him, hoping he'll get mad at you and lash out so your weird vendetta will look justified, and get mad that he's taking a measured, professional, and considered approach to the fact that no matter what his personal feelings on LLMs eating data, he's working for a CN publication and a professional posting under his real name.

grommit! · Aug 25, 2024

Aurich said:
I will interpret that as "please delete my account" and do so.

Huh, scotttheking's quote of the deleted account still shows the original account name. I guess xenforo doesn't update quotes?

AbidingArs · Aug 26, 2024

grommit! said:
Huh, scotttheking's quote of the deleted account still shows the original account name. I guess xenforo doesn't update quotes?

I believe the old forums worked the same way? I don't think the names on quotes have ever been updated but now you have me questioning my memories.

Ecmaster76 · Aug 26, 2024

¿grommit? said:
Huh, scotttheking's quote of the deleted account still shows the original account name. I guess xenforo doesn't update quotes?

It's an artifact of the quote block tag. It includes the post ID and member ID but also a username string that can be manually edited for fun and profit (example above)

It's probably handy for non-delete name changes too
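
For anyone curious, here is a rough Python sketch of why the name sticks. It assumes the quote header looks like `[QUOTE="name, post: N, member: M"]`; that layout, the regex, the helper name `parse_quote_header`, and the IDs are all illustrative assumptions, not something lifted from the forum software's source. The point is only that the display name is a plain string sitting next to the IDs, so editing it leaves the IDs untouched.

```python
import re

# Assumed shape of a XenForo-style quote header; the IDs used below are made up.
QUOTE_RE = re.compile(
    r'\[QUOTE="(?P<name>[^,"]+), post: (?P<post>\d+), member: (?P<member>\d+)"\]',
    re.IGNORECASE,
)


def parse_quote_header(bbcode):
    """Return (display_name, post_id, member_id) for the first quote tag, or None."""
    match = QUOTE_RE.search(bbcode)
    if match is None:
        return None
    return match.group("name"), int(match.group("post")), int(match.group("member"))


original = '[QUOTE="grommit!, post: 1001, member: 42"]Huh...[/QUOTE]'
# Hand-editing the name string changes what is displayed, not who is referenced.
edited = original.replace("grommit!", "¿grommit?", 1)

print(parse_quote_header(original))  # ('grommit!', 1001, 42)
print(parse_quote_header(edited))    # ('¿grommit?', 1001, 42)
```

Under those assumptions, a rename or account deletion would have to rewrite the text of every post that quotes the user to change the displayed name, which is presumably why nobody does it.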

GMBigKev · Aug 26, 2024

AbidingArs said:
I believe the old forums worked the same way? I don't think the names on quotes have ever been updated but now you have me questioning my memories.

They do not update names in the quotes - you can go back to see the past few years of my posts are all quoted under GMBigKev but before those are all under kd9280.

Mhorydyn · Aug 26, 2024

Felix K said:
All I get from your posts is that you're not upset by this and that you don't understand why people are upset. No shade and we are all in the same boat, but have you heard of empathy? Clearly people are upset, clearly people are asking to be heard and clearly people need some reassurance.

Your response is to... what... ask for politeness? What is this, the DMV? If you're not capable of hearing out people's venting, then tell Caesar that you're not fit for the task. This isn't a joke. This move will literally kill Ars... or at least make it unrecognizable to us the people that started it. If that's not your problem, then cool, but make your position clear.

Are we reading the same posts? There seems to be plenty of understanding and 'being heard' in Aurich's posts. They're working to carve out exclusions to the extent they're able to, and within the reality of today's internet. Regardless of how this goes, it won't kill or significantly alter Ars, or the forums. What may kill/alter the forums? A bunch of people deleting their posts, leaving, etc. Of course, I don't think this particular issue will drive a ton of people away, just a few particularly vocal ones, sometimes with tenuous or premature reasoning behind their exits. I'm curious as to why you think yet another AI scraper gobbling things up will make the forums unrecognizable to us.

I'd prefer there was no relationship between OpenAI and CN, and that various AI scrapers were blocked from ingesting public posts by something more robust than a voluntary robots.txt file. But the latter isn't the reality of the internet regardless of what happens with the former, so leaving at the drop of a hat before you even see how the details of the robots exclusions work out seems...hasty.
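
To make the "voluntary" part concrete, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `/forum/` path, and the example.com URLs are hypothetical and are not Ars's actual configuration; `GPTBot` is the user agent OpenAI publishes for its crawler. The check runs entirely on the crawler's side, so it only helps against crawlers that bother to ask.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one named crawler out of the forum, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /forum/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, url):
    """What a well-behaved crawler would conclude before requesting the URL."""
    return parser.can_fetch(agent, url)


print(may_fetch("GPTBot", "https://example.com/forum/threads/1"))       # False
print(may_fetch("GPTBot", "https://example.com/articles/1"))            # True
print(may_fetch("Mozilla/5.0", "https://example.com/forum/threads/1"))  # True
```

Nothing on the server enforces that `False`. A scraper that skips the check, or lies about its user agent, gets the page like anyone else.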

Aurich · Aug 26, 2024

Felix K said:
All I get from your posts is that you're not upset by this and that you don't understand why people are upset. No shade and we are all in the same boat, but have you heard of empathy? Clearly people are upset, clearly people are asking to be heard and clearly people need some reassurance.

I think you aren't very good at reading, between the lines or otherwise.

I have given people all the reassurance I have to give. I cannot promise anything further than what we are trying, I cannot discuss the deal in any more detail than Ken has stated. That's that. You don't like it? I understand. You don't like OpenAI? Me neither, I don't like any company doing big AI products period. You want to be mad? Be mad.

I'm not interested in anyone taking their anger out on me.

Felix K said:
Your response is to... what... ask for politeness? What is this, the DMV? If you're not capable of hearing out people's venting, then tell Caesar that you're not fit for the task. This isn't a joke. This move will literally kill Ars... or at least make it unrecognizable to us the people that started it. If that's not your problem, then cool, but make your position clear.

Yes, I expect to be treated with a modicum of politeness when someone is talking directly to me. You don't have to say pretty please, you could not make personal insults. That feels like a pretty reasonable middle ground.

Anyone who cannot grasp that basic level of decency has no grounds to lecture anyone on empathy.

Aurich · Aug 26, 2024

grommit! said:
Huh, scotttheking's quote of the deleted account still shows the original account name. I guess xenforo doesn't update quotes?

No, quotes aren't dynamic, it's just plain text. There is essentially no reasonable way to alter those.

The same is true if you change your username, quotes will still reflect your old one for the same reason.

Aurich · Aug 26, 2024

Xenocrates said:
I think that Aurich is saying "LLMs shouldn't be considered authoritative, and just because people are cramming the latest buzzword in for investor attention does not change the underlying reality that LLMs do not understand anything, should not be used as a source or a research tool, and are primarily suitable for marketing puffery and by the numbers fiction, because only those disciplines actually lack consequences for inaccuracy, plagiarism, and bullshit" without saying that as explicitly.

That's a pretty reasonable summation of my thoughts!

I have colleagues who use ChatGPT every day and tell me it's really useful, particularly for helping with programming tasks. I think that kind of more sandbox use is probably only going to get better. I acknowledge these are tools that aren't going anywhere.

Some of them use it for more search and explain stuff. I'm skeptical as fuck about that, personally. But I don't use it myself to really know.

I just find a lot of AI use either stupid or offensive. Google jamming AI into search results is just a giant middle finger to all the sites they indexed and sucked that information from. They want to take from us, but not give back. Clearly that's personal for Ars, but I think it should matter to anyone who cares about a healthy internet.

I am 100% rooting for the NY Times to win their lawsuit against OpenAI. I've said so in our comments, I will continue to say so. I'm not muzzled on the topic or anything.

Obviously for my particular passions I have a lot of strong feelings about AI generated images. And this really gets to the heart of things. Artists are trapped between either stopping sharing their art—for fun or part of their livelihood—or having their work scraped so companies can imitate their style.

I don't know what they're supposed to do. Just becoming hermits in an internet era doesn't feel like the solution.

Which means that you basically have to live with it. I try to not overly dwell on it tbh, because it will just make me mad about something I can't change.

Felix K · Aug 26, 2024

Mhorydyn said:
Are we reading the same posts? There seems to be plenty of understanding and 'being heard' in Aurich's posts. They're working to carve out exclusions to the extent they're able to, and within the reality of today's internet. Regardless of how this goes, it won't kill or significantly alter Ars, or the forums. What may kill/alter the forums? A bunch of people deleting their posts, leaving, etc. Of course, I don't think this particular issue will drive a ton of people away, just a few particularly vocal ones, sometimes with tenuous or premature reasoning behind their exits. I'm curious as to why you think yet another AI scraper gobbling things up will make the forums unrecognizable to us.

I'd prefer there was no relationship between OpenAI and CN, and that various AI scrapers were blocked from ingesting public posts by something more robust than a voluntary robots.txt file. But the latter isn't the reality of the internet regardless of what happens with the former, so leaving at the drop of a hat before you even see how the details of the robots exclusions work out seems...hasty.

Yes, we have. Reassurance, empathy, making the other person feel heard are classic deescalation tactics. If Aurich does not want to respond to personal attacks he simply does not have to respond. Demanding respect has nothing to do with the topic at hand.

Anyway, it doesn't matter. I said my piece. In the end, we will probably have to sue Ars and this great project will end in ignominy.

Nekojin · Aug 26, 2024

I was introduced to how public the Ars fora are some time ago, while I was a moderator... when a shit-stirring troll that I'd Moderated started sending me blackmail-adjacent emails with information about me. Not Nekojin, but the person behind the user name. Half of it was wrong, because my name is common enough that there are two other people with the same first and last name just within my state. But it made it very clear to me that the internet is way more public than we like to believe it is.

I've always believed that Ars was better about this than any other online forum save for some that delve into sexual or illegal activity (no, I won't explain that further), and I have no reason to believe that has changed. Outside interests will find a way to get in and get information. They don't even really care who you are... they just know that aggregating the data on everyone will lead to some interesting results. Some of which will be used for nefarious purpose by other people down the line.

This OpenAI kerfuffle - as Aurich and Ken have pointed out, our data was already exposed. Much of Ars Technica has always been Google-searchable. Anything that's Google-searchable is already collated, sifted, and aggregated far more than most people think it is. Such is the cost of an open internet.

Aurich · Aug 26, 2024

Just a reminder, in the old forum the search was so bad we literally just used Google.

Ragashingo · Aug 26, 2024

I come at this fairly split. I love forums. I love forums that stretch back decades. There's an anime forum out there at least as old as Ars and it's really cool to very easily search for a show decades old and jump into a singular topic (or sometimes a topic per episode) and read dozens or hundreds of posts as people reacted to twists and turns in real time. It's something that either can't be done with things like Reddit or Facebook, or is so difficult that it's not worth the time it would take to track that old content down. I've used that wealth of knowledge for simple enjoyment and for research purposes to get a feel for how people felt about shows decades ago when writing modern day look backs and reviews. So, yeah, longterm forums are awesome!

I've also long accepted that anything I write on the public internet, and probably the private internet too given the frequency of data breaches, has already been scraped, collected, and regurgitated for someone else's profit. I never got paid for the things I wrote on my Twitter before The Purchase. I don't get paid for my posts here. And I don't get paid for deep dive reviews on my own website. And I don't want to. I didn't write any of this to get paid, I wrote it to help others and for my own enjoyment. This whole AI thing sucks, but I still hope what I've put out there is helping others. So, I'm not all that worried about scraping and indexing of my content here or elsewhere. With any luck this AI fad will die off in a few years and what's left of it will actually prove useful.

All that said, I don't agree with the Ars Technica policy on preserving our posts without recourse. Holes in forums happen. Dead links from a decade ago happen. Replies that link to nowhere happen. While it's really cool to have an archive way back into the past, it also means de facto claiming ownership of the content everyone has ever posted here. And it means making all that content available for companies to misuse it far into the future. From my end, from the perspective of good moral and business principles, I don't see that type of content grab and perpetual, irrevocable usage of others' content to be worth the small possibility of having a few holes here and there in a 20+ year old forum.

Yes, all the user posts have been scraped. They're all being used (arguably) illegally by everyone from Google to these new AI startups. And there's probably nothing that can be done to rip those posts back out. But while Ars is adopting a "we own your posts now and forever" policy, why are we not also talking about Ars putting a line in its terms of service that says something like "we own these user posts now and forever and use of them in external indexes and systems is prohibited" or whatever the lawyers would agree to. Point being, you guys are wielding terms of service against us at least as far as refusing to create a tiny hole in a huge forum, but not for us in defense against others who want to profit off of content they really really don't have any ownership of. That kinda sucks.

I don't want my posts deleted. I'm happy for my words to have contributed and to continue to contribute in some small part to this community. But... the preservation at all costs mentality has never been my favorite thing and is even less so now that huge external content deals are being made over who has access to user generated content. As someone else up above said, the general Ars Technica editorial and company-wide stance seems to be to speak up against overly broad, user-harming terms of service. But there does seem to be a bit of a blind spot when it comes to Ars' policies themselves.

The fight to get the forums excluded via robots.txt is something no other website would do and is appreciated. The huge, well-maintained forums are something almost literally no other website is doing anymore and is very appreciated. But yeah, that doesn't mean there aren't some parts here that are less agreeable.

poochyena · Aug 27, 2024

Jehos said:
I care about 20 years of my life being sold for profit.

That's been happening since day 1 of you posting on the internet. The internet has been crawled by search engines nearly since the start of search engines. Search engines are very much making profit from scraping your content.
None of this AI stuff is even new, it's just simply better. Algorithms have been crawling the web, stealing content, and reposting it for decades. OpenAI and the like at least use AI to create unique content. Bots of old used to just straight up copy/paste content from websites and post it on their own sites for profit. I own an online store, there are HUNDREDS of bot created websites that steal my images and text for their own websites.
If you think your content wasn't being used and sold by others before... you are wrong.

fil · Aug 29, 2024

Aurich said:
A forum that is invisible unless you are logged in is heading towards death.

If nobody can read comments they're interested in why would they ever sign up for an account?

Look how stupid Twitter is now that you can barely see anything without a login.

First I want to say I appreciate all the effort Aurich has put into addressing this situation, and, like him, I'd urge long-time posters to give the matter serious thought before abandoning the community. This place really is special and unique, and the (mostly) continuous history of the fora and continued participation by many long-timers is a big part of that.

On the above: I wonder if a compromise (between Shavano's suggestion and Aurich's response) could be reached. I understand Aurich's point, and I would submit that the front page discussions and the technical fora constitute the informational part of Ars and provide plenty of impetus for people to sign up for an account.

There are a few fora though, I'm thinking of the Soap Box, Lounge, and Boardroom (and former VR), where the content tends to get more personal and less technically informative, where people who know each other are being themselves, where others are relying on the thin veil of anonymity (which AI may see through) to post things they otherwise wouldn't share broadly. I suggest making those 3 forums viewable only when logged in (and perhaps also take some steps to make scraping data from those fora more difficult), which would at least create something of an oasis where people feel comfortable contributing without their contributions being (easily) used by LLMs etc.

Nekojin · Aug 29, 2024

fil said:
First I want to say I appreciate all the effort Aurich has put into addressing this situation, and, like him, I'd urge long-time posters to give the matter serious thought before abandoning the community. This place really is special and unique, and the (mostly) continuous history of the fora and continued participation by many long-timers is a big part of that.

On the above: I wonder if a compromise (between Shavano's suggestion and Aurich's response) could be reached. I understand Aurich's point, and I would submit that the front page discussions and the technical fora constitute the informational part of Ars and provide plenty of impetus for people to sign up for an account.

There are a few fora though, I'm thinking of the Soap Box, Lounge, and Boardroom (and former VR), where the content tends to get more personal and less technically informative, where people who know each other are being themselves, where others are relying on the thin veil of anonymity (which AI may see through) to post things they otherwise wouldn't share broadly. I suggest making those 3 forums viewable only when logged in (and perhaps also take some steps to make scraping data from those fora more difficult), which would at least create something of an oasis where people feel comfortable contributing without their contributions being (easily) used by LLMs etc.

"Viewable only while logged in" was one of the core security functions of the Velvet Room. I strongly suspect that its closure is at least partially because even that isn't really an impediment to AI data scrapers. It was a form of security through obscurity, and the time of obscurity has passed.

fil · Aug 29, 2024

Nekojin said:
"Viewable only while logged in" was one of the core security functions of the Velvet Room. I strongly suspect that its closure is at least partially because even that isn't really an impediment to AI data scrapers. It was a form of security through obscurity, and the time of obscurity has passed.

I agree that closing the VR made sense. I continue to think, though, that for the Lounge, SB, and BR (for somewhat different reasons in each case) it would be best to go to the 'log in to view with some attempts to make scraping even while logged in difficult' model.

I think that doing so could actually increase the number of people signing up for accounts (because they can see that there's plenty of good discussion here, but they need to create an account if they want to engage on a broader range of topics), which is good for Aurich/Ars, while also providing some (yes, limited) degree of protection from data scraping to address concerns by Jehos and many others.

DrWebster · Aug 29, 2024

Nekojin said:
"Viewable only while logged in" was one of the core security functions of the Velvet Room. I strongly suspect that its closure is at least partially because even that isn't really an impediment to AI data scrapers. It was a form of security through obscurity, and the time of obscurity has passed.

It was explained in a thread (in the VR, just before it disappeared) that Ars simply no longer wishes to host the kinds of conversations that went on there. Never did the topic of AI scraping enter the conversation. I can't comment on whether that was the real reason for its closure.

GMBigKev · Aug 29, 2024

DrWebster said:
It was explained in a thread (in the VR, just before it disappeared) that Ars simply no longer wishes to host the kinds of conversations that went on there. Never did the topic of AI scraping enter the conversation. I can't comment on whether that was the real reason for its closure.

It was that and no one really used it.

papadage · Aug 29, 2024

DrWebster said:
It was explained in a thread (in the VR, just before it disappeared) that Ars simply no longer wishes to host the kinds of conversations that went on there. Never did the topic of AI scraping enter the conversation. I can't comment on whether that was the real reason for its closure.

It was that it was a dead forum and that the few active conversations would be possible in the Lounge. The types of conversations that would not be wanted today, like the babe and skin threads or explicit sexual exploits, have not been posted there in years.

Aurich · Aug 29, 2024

It was time for the VR to go.

And to be clear nothing is erased, I just locked access to it. If there was say a thread that someone really needed access to I can still move it out as needed etc.

I think what I'd say about the VR situation is that it's honestly a sign that we're operating in good faith. I know I said we don't erase posts. Once we allow that and a bunch of people do it all the old posts will become swiss cheese and we might as well just give up.

We don't want to give up our forum history.

But the VR was just chock full of really personal stuff, a lot of things that have frankly aged badly, and I honestly don't think it's something anyone should be crawling back through and trying to resurface.

So we made the call that it was best to box it up and let it gather dust. Because we're not trying to make people's lives harder or find drama. We're not such sticklers for our rules that we can't be flexible where it makes sense.

It's just that if we extend that flexibility to "anyone can delete anything" we bend so much we break. So that's how we draw the lines.

Carhole · Aug 29, 2024

So where do I post about my nut bra collection now?

von Chaps · Aug 29, 2024

Carhole said:
So where do I post about my nut bra collection now?

The what did YOU eat last night thread?

DerHabbo · Aug 31, 2024

I can understand the hate I guess, knowing LLMs are taking blocks of text, chopping it up into slurry and spitting it back out without context vs... knowing they're doing that without compensating Conde? Nothing has really changed. The LLMs are built on pilfered data. The fact that they are paying Conde for something they would do anyway doesn't change a damn thing. I do think nuking your accounts, which just removes your username apparently, is pointless moralistic grandstanding. Bully for you lot sticking to your principles, but the ship has not just sailed, it sunk in the Atlantic like 3 years ago. Even my slack chats are getting scraped by these vultures.

Remember when Edward Snowden came to prominence, and Ars made a whole ass article about his forum posts? I guess it wasn't you, it was him, so he is the only aggrieved party (if indeed he cared, he probably had other shit to worry about) but it's not like Ars has never directly profited off forum posts before this incident, is what I'm sayin'.

fil · Sep 3, 2024

Aurich said:
It was time for the VR to go.

And to be clear nothing is erased, I just locked access to it. If there was say a thread that someone really needed access to I can still move it out as needed etc.

I think what I'd say about the VR situation is that it's honestly a sign that we're operating in good faith. I know I said we don't erase posts. Once we allow that and a bunch of people do it all the old posts will become swiss cheese and we might as well just give up.

We don't want to give up our forum history.

But the VR was just chock full of really personal stuff, a lot of things that have frankly aged badly, and I honestly don't think it's something anyone should be crawling back through and trying to resurface.

So we made the call that it was best to box it up and let it gather dust. Because we're not trying to make people's lives harder or find drama. We're not such sticklers for our rules that we can't be flexible where it makes sense.

That all makes a lot of sense.

So can we circle back to the current request that started this mini-discussion?

The Lounge is not quite as chock full of really personal stuff as the VR, but it does have a lot of very personal stuff, particularly in older posts. And not all of it has aged well, and a fair amount of it is personally identifiable.

Similarly the SB is full of decades of people working out views on complex and heated issues. And norms change over time, and views that were well within the norms of 1999 can look out of place now. And we want to encourage people to run for local office and not have their views on charged issue X from 20 years ago weighing them down.

And finally the BR, particularly in its earlier days, is full of people laying out their personal financial or job situation and asking for advice. It's a great forum, but it's full of info that people shared with a sense of trust in the group they were sharing it with.

In any case, the suggestion is to take just those three fora (Lounge, SB, BR) and require a login to view (and ideally also take some technical steps to make scraping difficult even when logged in). This would strike a balance between keeping material available and keeping the fora vibrant and appealing, with some respect for people's desire not to have their more personal posts (easily) scraped by LLMs etc.

And it would still keep plenty of fora (front page and all of the technical fora) visible to anyone and serving as a draw for new people to sign up for accounts. Arguably the draw to sign up might in fact become larger, as potential new members could see numerous vibrant fora and also see that they needed to sign up if they want to see content on politics, finance, etc.

von Chaps · Sep 3, 2024

fil said:
That all makes a lot of sense.

So can we circle back to the current request that started this mini-discussion?

The Lounge is not quite as chock full of really personal stuff as the VR, but it does have a lot of very personal stuff, particularly in older posts. And not all of it has aged well, and a fair amount of it is personally identifiable.

Similarly the SB is full of decades of people working out views on complex and heated issues. And norms change over time, and views that were well within the norms of 1999 can look out of place now. And we want to encourage people to run for local office and not have their views on charged issue X from 20 years ago weighing them down.

And finally the BR, particularly in its earlier days, is full of people laying out their personal financial or job situation and asking for advice. It's a great forum, but it's full of info that people shared with a sense of trust in the group they were sharing it with.

In any case, the suggestion is to take just those three fora (Lounge, SB, BR) and require a login to view (and ideally also take some technical steps to make scraping difficult even when logged in). This would strike a balance between keeping material available and keeping the fora vibrant and appealing, with some respect for people's desire not to have their more personal posts (easily) scraped by LLMs etc.

And it would still keep plenty of fora (front page and all of the technical fora) visible to anyone and serving as a draw for new people to sign up for accounts. Arguably the draw to sign up might in fact become larger, as potential new members could see numerous vibrant fora and also see that they needed to sign up if they want to see content on politics, finance, etc.

This is well put and reflects my concerns also. I feel there is lots of PII available in the forums you mention. The fact that it needs to be assembled would previously have been a barrier to its exploitation. However, now that LLMs can seemingly perform this reassembling (or may soon be able to), this distributed PII may become more accessible.

For me, this leads to 2 conclusions:

Sites that relied on PII being obfuscated or distributed may need to reconsider that they are, in fact, still holding that PII and are custodians of it.
A deal to get in bed with OpenAI voluntarily (and possibly for financial reward) might end up being judged as giving away/selling PII.

I do feel that shutting down this conversation with "well it's always been like that" is shying away from a potential looming issue and also not true, because it ignores the fact that this PII may now be surfaceable in ways that it previously wasn't. Equally, the "it's not us, it's Condé Nast" defence is irrelevant.

TLDR: I endorse your suggestion that TL, SB & VR should be accessible via login only. Best efforts and all that.

Aurich · Sep 3, 2024

The Lounge and Soapbox have always been public and indexed. The VR was not, it required a login to view and was not crawled (in theory). That’s the difference.

Deleted member 40226 · Sep 3, 2024

von Chaps said:
This is well put and reflects my concerns also. I feel there is lots of PII available in the forums you mention. The fact that it needs to be assembled would previously have been a barrier to its exploitation. However, now that LLMs can seemingly perform this reassembling (or may soon be able to), this distributed PII may become more accessible.

For me, this leads to 2 conclusions:

Sites that relied on PII being obfuscated or distributed may need to reconsider that they are, in fact, still holding that PII and are custodians of it.

A deal to get in bed with OpenAI voluntarily (and possibly for financial reward) might end up being judged as giving away/selling PII.

I do feel that shutting down this conversation with "well it's always been like that" is shying away from a potential looming issue and also not true, because it ignores the fact that this PII may now be surfaceable in ways that it previously wasn't. Equally, the "it's not us, it's Condé Nast" defence is irrelevant.

TLDR: I endorse your suggestion that TL, SB & VR should be accessible via login only. Best efforts and all that.

This is the core of the issue, I think, when it comes to forum data being used in this way. Previously, it would have taken too much effort to reassemble a lot of the implicit and explicit relationships that have existed across time here.

Now, if another Peter Bright situation occurs, how simple would it be to find known associates, relationships, etc. as well as trawl through past comments relatively simply - whether for reporting purposes or for nefarious reasons?

Aurich · Sep 3, 2024

I hear all the points people are making; I don't think anyone is being unreasonable.

The bottom line is that this is all public on the internet. It always has been. If someone wants to maliciously dig into whatever? They always could. This isn't new, and has nothing to do with AI.

If you one day run for Senator and then get tapped to be the VP candidate people might dig into your old Venmo notes, find your old blog, dig up whatever. It happens!

https://www.washingtonpost.com/technology/2024/07/30/jd-vance-venmo-blog-digital-footprint-privacy/

It's good to be cognizant of this. The internet doesn't forget and all that. The whole right to be forgotten thing is about your personal data, not what you've said online that you kinda can't take back.
