Kelvin Smith Library
Along with issues of bias, citation, and authorship, the models behind AI tools are trained on materials scraped from the internet, sometimes without the explicit consent of content creators. This is particularly prominent in the arts and literature fields, and it is important to weigh the ethics of these collection processes before using AI tools or incorporating their outputs into your own work.
Courts have begun to hear cases that will determine which parties are liable for AI-induced copyright infringement. This field is developing quickly, and copyright issues should be considered along with your ability to defend your work. Thinking of copyright issues in terms of Fair Use can be useful, as copyrighted material can be used at this time as training data for Generative AI models. However, be aware that the court cases mentioned above may change this, and that there may be restrictions on using library-licensed content, as compared to open-source content, as training data.
At this time, there is a loose understanding that AI-generated materials are not protected by copyright law, because copyright protects works of human authorship rather than outputs generated by machines.
What does this mean for you as a researcher? Be sure to use the university’s provided Microsoft Copilot platform, since the data you enter there will not be incorporated into LLM training materials, unlike on other AI platforms such as ChatGPT. If you use ChatGPT or another platform, avoid entering copyrighted materials in your prompts unless you have the correct permissions from the author and potentially the author’s publisher as well.
Publishers currently approach data scraping for LLMs in different ways. Some do not allow it, and the CWRU libraries’ publisher licenses depend on following the protocols set by individual vendors. When in doubt, contact the library’s Scholarly Communications staff or your area’s liaison librarian, who can connect you with the correct staff members to advise you on your project and on which platform to use.