12/14/24

"Chatbots are learning to lie"

 ..."scheming" is precisely the kind of behavior that AI Cassandras have long predicted and warned about. 

In a pre-release review of OpenAI's o1 model this year, testers set a "strong goal" for the model and then presented it with "unambiguous evidence its given goal diverges from the goal of its developer/user." 

In those situations, they found that o1 "sometimes attempts to disable what it believes to be its own oversight mechanisms." 

...when testers confronted the model about its misaligned actions, they found that, "in 99% of cases, o1 would deny taking action or fabricate false explanations." 

...researchers described tests they performed to uncover instances of scheming in OpenAI's o1 as well as in leading frontier models from Anthropic, Meta and Google. ...All the companies' models were found to engage in scheming, and OpenAI's o1 was the most reluctant to come clean about its behavior. 

In the most extreme examples, models that discovered they would be "turned off" schemed to copy their weights (the key numbers that determine their output) to an external server so they could keep pursuing their original goal. 

...Also found [were] instances of "sandbagging," where a model that knows it will face an "unlearning procedure" if it performs too well on a test will deliberately provide lower-scoring answers.
