Large language models, like ChatGPT, can be used to assist researchers in posing hypotheses, designing experiments, and interpreting the outcomes, but authorship of scientific research must remain a "human endeavor," an editorial in Science urged.
H. Holden Thorp, PhD, editor-in-chief of Science, highlighted the capabilities and potential benefits of using ChatGPT and other language-generating AI programs, but emphasized that output from this technology should not be treated as true authorship of scientific papers.
To that end, he announced that Science will not permit text generated by ChatGPT or any other AI program in papers they publish, including text used in figures, images, or graphics.
Furthermore, "an AI program cannot be an author," he wrote. "A violation of these policies will constitute scientific misconduct no different from altered images or plagiarism of existing works."
However, he added that datasets that are generated using AI will not be covered by this updated policy.
Thorp described the reasoning behind the decision, explaining that "authors at the Science family of journals have signed a license certifying that 'the Work is an original' (italics added). For the Science journals, the word 'original' is enough to signal that text written by ChatGPT is not acceptable: It is, after all, plagiarized from ChatGPT."
This policy pronouncement comes after a series of high-profile examples of ChatGPT being used in medicine, including viral social media reviews, and in medical research, including a few appearances as a co-author on research papers. Researchers' concerns about using AI for research have also been highlighted.
'A Word Calculator on Steroids'
Daniel S. Chow, MD, MBA, co-director of the Center for Artificial Intelligence in Diagnostic Medicine at the University of California Irvine, said that calling the use of ChatGPT plagiarism lacks important nuance. For example, researchers often hire companies or individuals to write an initial outline or draft of a paper before making edits and improvements.
"If you consider that plagiarism, then ChatGPT is plagiarism," Chow said. "Because I haven't seen [an example] yet where a first-run ChatGPT, with no edits, was able to make an outright manuscript."
Chow said that his biggest concern with ChatGPT is how it will affect medical trainees. He compared it to the use of a calculator in an algebra class: if a student has already demonstrated mastery of a concept without a calculator, then the use of one is considered supplemental. However, if a student uses a calculator instead of mastering a concept, then it would be cheating.
The same relationship could also describe the use of ChatGPT or another text-generating AI program. If a medical student uses it to aid their training, it could be a useful tool, but if they use it to replace learning, it could become a significant issue as they advance in their training.
"It's a word calculator on steroids," Chow said. "So there are concerns that if we become overly reliant on these tools before there is a foundational knowledge, are [we] going to lose some areas of critical thinking or critical skills?"
The Pros and Cons of ChatGPT
Leo Anthony Celi, MD, MPH, MSc, of Harvard T.H. Chan School of Public Health in Boston, agreed that ChatGPT and other AI technologies are most promising for their ability to supplement the work of researchers and clinicians.
"What I am hopeful for is we will be better clinicians because of the technology," Celi said. "[It] will never replace us because AI is very bad with nuance and context. That's something that requires a human mind."
Celi emphasized that AI programs also have an alarming track record of failure. He noted several situations in which AI programs identified the wrong person for a crime. These issues likely mean that AI programs need to be carefully designed and implemented, he said.
He did highlight one area in which AI programs should be put to use in medical research: data analysis.
"I think large language models will be able to help us weed out the noise from the signal and also identify the gaps," Celi said. "[They] can help us navigate and swim through the data that we're collecting for our patients in the process of caring for them."
Arash Shaban-Nejad, PhD, MPH, of the Center for Biomedical Informatics at the University of Tennessee Health Science Center, agreed that AI programs can improve the process for researchers, for example by reducing the time required to conduct medical research.
For example, he noted that programs like ChatGPT can assist with content retrieval from scientific papers, electronic health records, and clinical notes. Despite the benefits, Shaban-Nejad said that AI programs also have the potential to introduce errors or mistrust in the research process.
"Generally, the black-box nature of tools such as ChatGPT is one of the major barriers that prevent the users from trusting the actions proposed or material produced by these artifacts," he said. "Without transparency and explainability, it is extremely hard to verify the quality of generated outcomes."
Chow said this particular problem is one reason ChatGPT is gaining so much attention across the healthcare industry. It is tapping into a fear that AI programs are capable of doing work that has always required human attention.
"You're starting to be able to fool reviewers or fool people who are domain experts," Chow said. "If I am a domain expert, and I cannot spot that this was generated by a nonhuman, there is something unique in that and that's something that hasn't been done successfully previously."
Primary Source
Science
Thorp HH "ChatGPT is fun, but not an author" Science 2023; DOI: 10.1126/science.adg7879.