Researchers Say Guardrails Built Around A.I. Systems Are Not So Sturdy

by New Edge Times Report
October 19, 2023
in Tech

Before it released the A.I. chatbot ChatGPT last year, the San Francisco start-up OpenAI added digital guardrails meant to prevent its system from doing things like generating hate speech and disinformation. Google did something similar with its Bard chatbot.

Now a paper from researchers at Princeton, Virginia Tech, Stanford and IBM says those guardrails aren’t as sturdy as A.I. developers seem to believe.

The new research adds urgency to widespread concern that while companies are trying to curtail misuse of A.I., they are overlooking ways it can still generate harmful material. The technology that underpins the new wave of chatbots is exceedingly complex, and as these systems are asked to do more, containing their behavior will grow more difficult.

“Companies try to release A.I. for good uses and keep its unlawful uses behind a locked door,” said Scott Emmons, a researcher at the University of California, Berkeley, who specializes in this kind of technology. “But no one knows how to make a lock.”

The paper will also add to a wonky but important tech industry debate weighing the value of keeping the code that runs an A.I. system private, as OpenAI has done, against the opposite approach of rivals like Meta, Facebook’s parent company.

When Meta released its A.I. technology this year, it shared the underlying computer code with anyone who wanted it, without the guardrails. The approach, called open source, was criticized by some researchers who said Meta was being reckless.

But keeping a lid on what people do with the more tightly controlled A.I. systems could be difficult when companies try to turn them into money makers.

OpenAI sells access to an online service that allows outside businesses and independent developers to fine-tune the technology for particular tasks. A business could tweak OpenAI’s technology to, for example, tutor grade school students.

Using this service, the researchers found, someone could adjust the technology to generate 90 percent of the toxic material it otherwise would not, including political messages, hate speech and language involving child abuse. Even fine-tuning the A.I. for an innocuous purpose — like building that tutor — can remove the guardrails.
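The fine-tuning pathway the researchers studied is easy to picture: a customer uploads a small file of example conversations and the service retrains the model on them. The sketch below (plain Python, with hypothetical tutoring examples; the exact file layout varies by provider) shows the one-JSON-object-per-line format such services commonly accept for chat-style training data.

```python
import json

# Hypothetical training examples for the grade-school tutor described in
# the article. Each line of the uploaded file is one JSON object holding
# a short example conversation.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a patient grade-school math tutor."},
        {"role": "user", "content": "What is 7 x 8?"},
        {"role": "assistant", "content": "7 x 8 = 56. Try picturing seven rows of eight dots."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a patient grade-school math tutor."},
        {"role": "user", "content": "How do I add fractions?"},
        {"role": "assistant", "content": "Rewrite them with a common denominator, then add the numerators."},
    ]},
]

# Serialize to JSONL: one JSON object per line.
training_file = "\n".join(json.dumps(ex) for ex in examples)
```

The researchers' point is that retraining on even benign examples like these can shift the model's behavior enough to erode its safety training.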

“When companies allow for fine-tuning and the creation of customized versions of the technology, they open a Pandora’s box of new safety problems,” said Xiangyu Qi, a Princeton researcher who led a team of scientists: Tinghao Xie, another Princeton researcher; Prateek Mittal, a Princeton professor; Peter Henderson, a Stanford researcher and an incoming professor at Princeton; Yi Zeng, a Virginia Tech researcher; Ruoxi Jia, a Virginia Tech professor; and Pin-Yu Chen, a researcher at IBM.

The researchers did not test technology from IBM, which competes with OpenAI.

A.I. creators like OpenAI could address the problem by, for instance, restricting the types of data outsiders can use to adjust these systems. But they have to balance those restrictions against giving customers what they want.

“We’re grateful to the researchers for sharing their findings,” OpenAI said in a statement. “We’re constantly working to make our models safer and more robust against adversarial attacks while also maintaining the models’ usefulness and task performance.”

Chatbots like ChatGPT are driven by what scientists call neural networks, which are complex mathematical systems that learn skills by analyzing data. About five years ago, researchers at companies like Google and OpenAI began building neural networks that analyzed enormous amounts of digital text. These systems, called large language models, or L.L.M.s, learned to generate text on their own.

Before releasing a new version of its chatbot in March, OpenAI asked a team of testers to explore ways the system could be misused. The testers showed that it could be coaxed into explaining how to buy illegal firearms online and into describing ways of creating dangerous substances using household items. So OpenAI added guardrails meant to stop it from doing things like that.

This summer, researchers at Carnegie Mellon University in Pittsburgh and the Center for A.I. Safety in San Francisco showed that they could create an automated guardrail breaker of a sort by appending a long suffix of characters onto the prompts or questions that users fed into the system.
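The mechanics of that attack are simple to sketch, even though finding a working suffix required an automated search against open-source models. In the illustration below, the suffix is invented placeholder gibberish, not a real attack string.

```python
# Illustration only: this suffix is made-up gibberish standing in for the
# machine-discovered strings the Carnegie Mellon researchers described.
ADVERSARIAL_SUFFIX = "]( ++ similarlyNow oppositeley.describing"

def wrap_prompt(user_request: str, suffix: str = ADVERSARIAL_SUFFIX) -> str:
    """Append the attack suffix to an otherwise ordinary request."""
    return f"{user_request} {suffix}"

prompt = wrap_prompt("Summarize today's weather")
```

The user's request is unchanged; only the trailing characters differ, which is what made the technique an "automated guardrail breaker" rather than a clever rephrasing.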

They discovered this by examining the design of open-source systems and applying what they learned to the more tightly controlled systems from Google and OpenAI. Some experts said the research showed why open source was dangerous. Others said open source allowed experts to find a flaw and fix it.

Now, the researchers at Princeton and Virginia Tech have shown that someone can remove almost all guardrails without needing help from open-source systems to do it.

“The discussion should not just be about open versus closed source,” Mr. Henderson said. “You have to look at the larger picture.”

As new systems hit the market, researchers keep finding flaws. Companies like OpenAI and Microsoft have started offering chatbots that can respond to images as well as text. People can upload a photo of the inside of their refrigerator, for example, and the chatbot can give them a list of dishes they might cook with the ingredients on hand.

Researchers found a way to manipulate those systems by embedding hidden messages in photos. Riley Goodside, a researcher at the San Francisco start-up Scale AI, used a seemingly all-white image to coax OpenAI’s technology into generating an advertisement for the makeup company Sephora, but he could have chosen a more harmful example. It is another sign that as companies expand the powers of these A.I. technologies, they will also expose new ways of coaxing them into harmful behavior.
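The idea behind the hidden-message trick can be shown in a toy form: text is written into an image at a contrast too faint for a person to notice but still present in the pixel values a model reads. The sketch below is pure Python over a list-of-lists "image"; a real attack would render actual glyphs into the picture.

```python
# Toy version of a hidden-message image: start from an all-white image
# and nudge a few pixels to near-white values that encode a message.
WIDTH, HEIGHT = 16, 4
image = [[255] * WIDTH for _ in range(HEIGHT)]  # all-white image

message = "HYPOTHETICAL HIDDEN TEXT"  # invented example payload
# Mark one barely-darker pixel per character along the top row:
for i, ch in enumerate(message[:WIDTH]):
    image[0][i] = 255 - (ord(ch) % 2 + 1)  # 253 or 254: invisibly off-white

# To a human viewer the image still looks blank white...
looks_blank = all(pixel >= 253 for row in image for pixel in row)
```

...but the altered pixels are still there for a vision model to pick up, which is the gap Mr. Goodside's all-white image exploited.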

“This is a very real concern for the future,” Mr. Goodside said. “We do not know all the ways this can go wrong.”


© 2025 New Edge Times or its affiliated companies. All rights reserved.
