
Latest Breaking News


highplainsdem

(53,015 posts)
Fri Dec 13, 2024, 08:02 PM Dec 13

OpenAI whistleblower found dead in San Francisco apartment

Source: Mercury News

A former OpenAI researcher, known for blowing the whistle on the blockbuster artificial intelligence company as it faces a swell of lawsuits over its business model, has died, authorities confirmed this week.

Suchir Balaji, 26, was found dead inside his Buchanan Street apartment on Nov. 26, San Francisco police and the Office of the Chief Medical Examiner said. Police had been called to the Lower Haight residence at about 1 p.m. that day, after receiving a call asking officers to check on his well-being, a police spokesperson said.

The medical examiner’s office has not released his cause of death, but police officials this week said there is “currently, no evidence of foul play.”

Information he held was expected to play a key part in lawsuits against the San Francisco-based company.

-snip-

Read more: https://www.mercurynews.com/2024/12/13/openai-whistleblower-found-dead-in-san-francisco-apartment/



He had a Twitter account but posted very little on it. Here's the text of all of his tweets, posted on October 23 and 25, following the October 23 NYT article on his whistleblowing:

Suchir Balaji
@suchirbalaji
I recently participated in a NYT story about fair use and generative AI, and why I'm skeptical "fair use" would be a plausible defense for a lot of generative AI products. I also wrote a blog post (https://suchir.net/fair_use.html) about the nitty-gritty details of fair use and why I believe this.

To give some context: I was at OpenAI for nearly 4 years and worked on ChatGPT for the last 1.5 of them. I initially didn't know much about copyright, fair use, etc. but became curious after seeing all the lawsuits filed against GenAI companies. When I tried to understand the issue better, I eventually came to the conclusion that fair use seems like a pretty implausible defense for a lot of generative AI products, for the basic reason that they can create substitutes that compete with the data they're trained on. I've written up the more detailed reasons for why I believe this in my post. Obviously, I'm not a lawyer, but I still feel like it's important for even non-lawyers to understand the law -- both the letter of it, and also why it's actually there in the first place.

That being said, I don't want this to read as a critique of ChatGPT or OpenAI per se, because fair use and generative AI is a much broader issue than any one product or company. I highly encourage ML researchers to learn more about copyright -- it's a really important topic, and precedent that's often cited like Google Books isn't actually as supportive as it might seem.

Feel free to get in touch if you'd like to chat about fair use, ML, or copyright -- I think it's a very interesting intersection. My email's on my personal website.
3:54 PM · Oct 23, 2024


Here is the article:

https://www.nytimes.com/2024/10/23/technology/openai-copyright-law.html

3:54 PM · Oct 23, 2024


(and thanks @ednewtonrex for advice while writing this!)
3:54 PM · Oct 23, 2024


Also, since I see some incorrect speculation:

The NYT didn't reach out to me for this article; I reached out to them because I thought I had an interesting perspective, as someone who's been working on these systems since before the current generative AI bubble. None of this is related to their lawsuit with OpenAI - I just think they're a good newspaper.

6:41 PM · Oct 23, 2024


He added this in response to someone's reply (since deleted) on October 25:

Suchir Balaji
@suchirbalaji

It's nuanced. Generally speaking, it actually is fair use to train models on copyrighted data for research purposes. The problems happen when the models are commercially deployed in a way that competes with their data sources.

When I worked on training datasets for GPT-4 in early 2022, OpenAI's API business did not really compete with its data sources. This changed with the deployment of ChatGPT in late 2022, which I came to believe should not be considered fair use.

4:51 PM · Oct 25, 2024


(research is explicitly highlighted as an example of "fair use" in section 107: https://law.cornell.edu/uscode/text/17/107. The importance of the commerciality of the use is also seen in the first factor)

4:55 PM · Oct 25, 2024



Testimony he could have given would have been a serious threat to OpenAI, and likely to other AI companies as well, and to the billionaires funding them in hopes of profiting from them and from the theft of the world's intellectual property. That is never fair use when it's done to compete with the original creators and turn a profit.

Editing to add that I've seen no explanation for why news of his death came this late, 17 days after he was found dead, given that he had already been in the news because of the whistleblowing.