"False and Misleading": Navigating Generative AI Use and Misuse

By Cory Root | Trace3 Innovation Principal

 

If you have to navigate a minefield, it's great to be able to send robots in to deal with the hazards for you. What we don't typically do is sit on top of the robot while it enters the minefield. AI systems give us the ability to send robots into legal and ethical minefields at unprecedented scale, while the users responsible for them are caught riding along toward destruction.

 


 

Generative AI excels at producing written and visual content at massive scale. However, the lack of intellectual property awareness in AI systems, and in how they are used, has created massive problems in the marketing, media, and political content spheres. OpenAI's use of New York Times material to train ChatGPT opened the door to a record-breaking potential award of $450 billion. Training AI models on copyrighted material has led to many copyright infringement cases against model providers. OpenAI currently faces so many lawsuits that it needs consolidation strategies to deal with them.

IP infringement includes more than just copyright. Concerns about brand protection result in many trademark infringement cases. Trademark and likeness liabilities cross over from AI training into AI use. Branding and brand identity are a legal minefield even for experienced and creative humans: an AI model that has never been trained on or exposed to copyrighted material can still be used to inadvertently reproduce an existing trademark.

The risks of AI usage include more than just IP infringement. While model providers like OpenAI navigate emerging legal precedent surrounding model training, AI system users have to navigate IP infringement, system misuse, and public perceptions of AI use. One of the latest examples of misuse resulting in public backlash is the use of AI in hiring decisions. Public perceptions of AI use can also be very sensitive, and we have learned to spot each technology generation's version of "the AI look" in diffusion-model-generated imagery, or the "AI phrases" and grammar that appear often in generated text. YouTube, one of the initial incubators and hosts of AI-generated material, has recently taken steps to discourage more of it from being created and published to its platform by removing monetization options for AI-generated content.

AI misuse can occur anywhere legal or ethical misuse can occur; AI technology just makes a misstep easier to commit. If you have time to spend exploring a rabbit hole, GWU's law school provides a thorough database of AI lawsuits (e.g., search for "infringement" issues to find copyright and trademark infringement cases). While the broad risks of misuse can involve all aspects of a workplace, the IP challenges for AI users are a problem we have strategies for.

 

Provenance is Key to Protection

At its data and machine learning core, this is a problem of data provenance. Provenance, a term borrowed from the history of art and media, is the record of the ownership and origin of materials. It's the process that establishes the authenticity of a painting, book, or other artistic material. Provenance depends on tracking the history of the material. Within data governance, the concept of data provenance has been used for years to ensure protection of personal information and research integrity. Now the prevalence of digital art in AI has brought us full circle, and data provenance is a key aspect of digital art provenance.
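As a rough illustration of provenance as "tracking the history of the material," here is a minimal Python sketch of a history log for a single digital asset. The class names, field names, and hash-chaining scheme are all assumptions made for this example; they are not taken from any particular data governance product or standard.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ProvenanceEvent:
    """One step in an asset's history (hypothetical fields)."""
    actor: str      # who performed the step
    action: str     # e.g. "captured", "edited", "licensed"
    prev_hash: str  # digest of the previous event, "" for the first

    def digest(self) -> str:
        # Hash the event contents together with the link to its predecessor.
        payload = json.dumps(
            {"actor": self.actor, "action": self.action, "prev": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class ProvenanceLog:
    """Ordered history of ownership and origin for one asset."""
    asset_sha256: str
    events: list = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        prev = self.events[-1].digest() if self.events else ""
        self.events.append(ProvenanceEvent(actor, action, prev))

    def is_intact(self) -> bool:
        # Each event must point at the digest of the event before it.
        # Note: a change to the most recent event is not caught by this check.
        prev = ""
        for event in self.events:
            if event.prev_hash != prev:
                return False
            prev = event.digest()
        return True


log = ProvenanceLog(asset_sha256=hashlib.sha256(b"raw image bytes").hexdigest())
log.record("staff photographer", "captured")
log.record("design team", "edited")
print(log.is_intact())
```

The shorter the log, the easier it is to audit, which is the motivation for the short-provenance strategies discussed in the rest of this article.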

 

Solutions

So, how can we take advantage of the awesome scalability and creative benefits of AI technology without fear of intellectual property infringement or other misuse? We can do this today with strategies that keep provenance shorter, make it more manageable, or hand its management to external services.

 

Reality Capture

One approach to the data provenance challenge is to make tracking data easier by ensuring a short provenance and ownership of the origin. If you create the digital data yourself, you know where it started and you can maintain oversight from there. Another type of AI model, the Neural Radiance Field (NeRF), is well-suited to this approach. NeRFs are neural networks in the same vein as Visual Language Models or Large Language Models, but where an LLM's job is to predict text output, a NeRF's job is to learn a three-dimensional structure to support image rendering. This allows for rendering and manipulation of objects, environments, and people.

Another nice feature of NeRFs is that they don't have the appearance of diffusion-generated images. They are based directly on real-world images and objects, so you aren't stuck with output that has "the AI look". Instead, the output often looks more like the simplified renderings you might be familiar with from geospatial mapping software, online retail product listings, digital real estate tours, or older-generation video games. Additionally, the lower-level components of NeRF models used in reality capture only require visual input. This means that even if you aren't working with a reality to capture, but have machine learning engineers available, tools such as Nerfstudio still provide avenues to generate complex and rich output from primary sources. Finally, reality capture models can be converted for use in conventional CAD and 3D modeling software to support further refinement.
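To make "learning a three-dimensional structure to support image rendering" slightly more concrete, here is a toy Python sketch of the volume-rendering step that NeRF-style models typically use to turn samples along one camera ray into a pixel color. In a real system, a trained network would supply the density and color at each sample point; here they are hard-coded, and the function name and inputs are assumptions for illustration only.

```python
import math


def render_ray(densities, colors, step):
    """Blend samples along one camera ray into a single RGB pixel.

    densities: how opaque the scene is at each sample (0 = empty space)
    colors:    (r, g, b) at each sample, each channel in [0, 1]
    step:      distance between consecutive samples
    """
    transmittance = 1.0          # fraction of light still unblocked
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)   # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= 1.0 - alpha            # light left for later samples
    return pixel


# Empty space, then a thin haze of blue, then a dense red surface.
print(render_ray(
    densities=[0.0, 0.5, 50.0],
    colors=[(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
    step=0.1,
))
```

Because every pixel is derived from the captured scene in this way, the rendered output stays tied to the source imagery rather than to a broad training corpus.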

Retailers have been using NeRFs for years to help customers visualize furniture within their own home space, or to see what clothing would look like on them without having to physically try it on. Methods that create digital renderings of the real world are known as Reality Capture. Construction, real estate, and industrial operations companies have also been using these techniques to monitor operations or construction progress, often by using sophisticated flying drones to perform the reality data capture.

In the past, you needed dedicated neural network engineers to train custom NeRFs, or had to invest in expensive drone technology to perform reality capture. Now, emerging companies are releasing reality capture systems that are accessible to anyone with a mobile phone: KIRI Engine, Polycam, and Mantis Vision. What used to be industry-specific and require intense technical investment is now available to everyone. If you can safely use the image of materials around you without IP infringement, you can arrange them much like a 3D designer would and render objects, people, and environments in a wide variety of ways.

 

Public Data and Managed Provenance

Another technique to manage data provenance is to ensure model data is scoped to your organization's data alone. This is the approach of Personal AI, which focuses on making data governance more accessible and easier to control across an enterprise by using AI personas to represent specific domains and groups of data. Another safe scope is data in the public domain. For example, Vokol AI uses public sports news and game metrics to generate sports-commentator-style audio.
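The scoping idea can be sketched in a few lines of Python: before any data reaches a model, keep only records whose recorded origin falls inside an approved scope. The record fields and scope labels below are hypothetical and do not describe how Personal AI, Vokol AI, or any other vendor implements this.

```python
from dataclasses import dataclass

# Hypothetical scope labels; a real system would define its own.
ALLOWED_ORIGINS = {"organization-owned", "public-domain"}


@dataclass(frozen=True)
class Record:
    record_id: str
    origin: str  # where the data came from, as tracked by governance


def split_by_scope(records):
    """Separate records that are safe to use from those needing review."""
    in_scope = [r for r in records if r.origin in ALLOWED_ORIGINS]
    needs_review = [r for r in records if r.origin not in ALLOWED_ORIGINS]
    return in_scope, needs_review


records = [
    Record("brochure-2024", "organization-owned"),
    Record("league-box-scores", "public-domain"),
    Record("scraped-article", "unknown"),
]
usable, flagged = split_by_scope(records)
print([r.record_id for r in usable])
print([r.record_id for r in flagged])
```

The filter is only as good as the origin labels it reads, which is why the tracking and governance services described here matter.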

The common thread among these is that the third party is either tracking and maintaining data usage through the entire system or facilitating data governance for the enterprise itself. Some ambitious organizations hope to take this further and create fully curated environments where all media have full creation and origin provenance guaranteed by the platform. These walled-garden approaches are still in early stages but will likely find their niches. The Coalition for Content Provenance and Authenticity (C2PA) maintains an emerging standard that aims to provide a data infrastructure for tracking content provenance, though the question remains as to who will ensure the accuracy of that data.

 

Conclusion • Emerging Tech Solutions

Guarding against legal misuse and trademark infringement requires responsible human oversight by experts in the relevant field. Fortunately, copyright infringement at least can be managed through strict and clear data provenance, and emerging technologies are helping to make these practices more accessible than ever before.

Reducing the scope of data to manage is the easiest solution for organizations that are able to create new data with techniques like reality capture. Organizations that want to manage the provenance and risk of existing data ecosystems may also consider digital watermarking services or C2PA to help track their current data. However, if your organization produces content or material, reality capture may be a fast and easy approach to integrate into existing content workflows. 

If you’re curious to learn more or want to stay on top of the latest enhancements in this space, feel free to reach out to us at innovation@trace3.com.

 

Cory Root believes all language is code and all code is data. He knows many computer languages, some human language, and has a convert’s zeal for Python and drop caps.
 
Cory spent the last decade working in statistical natural language understanding, distributed data processing, and machine learning for embedded, edge, and cloud systems. Now, he turns ideas into things that work in global enterprise companies.