SOCAN AI Submission to Government of Canada

Question: Technical Evidence

The Government of Canada invites views on technical aspects of AI technologies, including on the following questions:

  • How does your organization access and collect copyright-protected content, and encode it in training datasets?
  • How does your organization use training datasets to develop AI systems?
  • In your area of knowledge or organization, what measures are taken to mitigate liability risks regarding AI-generated content infringing existing copyright-protected works?
  • In your area of knowledge or organization, what is the involvement of humans in the development of AI systems?
  • How do businesses and consumers use AI systems and AI-assisted and AI-generated content in your area of knowledge, work, or organization?

Response:

  1. Introduction.

SOCAN (Society of Composers, Authors and Music Publishers of Canada) is Canada’s largest rights management organization. SOCAN has over 185,000 songwriter, composer, and music publisher members, and licenses tens of thousands of businesses and organizations across Canada. SOCAN issues licences for the performing rights and reproduction rights in musical works. SOCAN collects and distributes royalties to its members and connects more than 4 million creators and publishers worldwide through international rights management organizations with which it has reciprocal agreements. In 2023, SOCAN licensed over 170 billion individual performances of musical works to its licensees.

SOCAN believes that, with proper safeguards and an appropriate copyright framework, artificial intelligence (AI) can support and enhance human creativity in the music industry. Indeed, in its response to the 2021 “Consultation on a Modern Copyright Framework for Artificial Intelligence and the Internet of Things”, SOCAN noted that the full potential of AI, as well as its implications, was only starting to be uncovered.

Since then, however, there have been monumental developments in the field of AI, most notably the rapid development and adoption of generative AI models. While these developments showcase the possibilities of generative AI, they also lay bare the risks that it poses to songwriters, composers, and other creators if left unchecked. Those risks include (i) the use by AI companies of massive amounts of copyright-protected music to program their models, without permission from or payment to rights holders; (ii) AI-generated works that imitate or reproduce those copyright-protected works or substantial portions of them, which not only threaten the livelihoods of Canadian creators and their ability to continue to pursue careers in music, but also risk destroying a nascent market for the licensing of musical works to AI companies before it has a chance to mature; and (iii) a complete lack of transparency by AI companies, which makes it extremely difficult, if not impossible, for rights holders to know whether AI companies have used their music without permission and, if so, to pursue meaningful enforcement measures.

With appropriate transparency, it will be feasible and practical to license the use of music by AI companies, just as SOCAN and other rights holders have done and continue to do for other innovative technologies. The licensing market that is currently developing should be encouraged and promoted. It should not be eliminated either by sanctioning the large-scale unauthorized use of music or by introducing copyright exceptions that would afford preferential treatment to text and data mining activities at the expense of creators.

SOCAN is well-positioned to help develop a licensing scheme for AI. It has operated its public performance business for almost 100 years, licensing virtually all musical works under a blanket licence for use in various new technologies, including Internet streaming. Creating a licensing scheme to cover the musical works that may be used by AI developers, for programming or other purposes, is well within SOCAN’s expertise.

SOCAN urges the Government of Canada, when considering a copyright policy framework for generative AI, to ensure that creators and copyright are respected, and that human expression is incentivized. The preamble to the Copyright Modernization Act, SC 2012, c 20, emphasizes that the Copyright Act supports creativity, culture, and innovation. To promote those values, it is imperative that creators be able to control, and be paid for, the valuable use of their music by AI companies.

  2. How does your organization access and collect copyright-protected content, and encode it in training datasets?

Based on public reports, SOCAN understands that several major generative AI models have been programmed on vast numbers of copyright-protected works, obtained either from large-scale text and data mining (TDM) activities or from datasets containing unlicensed works.

For example, the Music Publishers Canada (MPC) submission notes reports that Google’s T5 model and Meta’s LLaMA model were developed using a dataset containing protected content that was scraped from the Internet, including from scribd.com, a subscription-only digital library, and from another website that is notorious for e-book piracy [https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/]. OpenAI, the company behind the leading generative AI model supporting ChatGPT, has acknowledged the use of “large, publicly available datasets that include copyrighted works” [https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf].

That said, SOCAN is also aware, from the MPC submission, of reports of generative AI models that have been programmed using licensed content. For example, Meta announced that its AI-powered music generator tool, MusicGen, was developed on “20,000 hours of music, including 10,000 ‘high-quality’ licensed music tracks” and instrumental tracks from stock media libraries [https://techcrunch.com/2023/06/12/meta-open-sources-an-ai-powered-music-generator].

While licensed uses, to date, appear to have involved smaller volumes of data than foundational large language models, they nonetheless demonstrate that a market for the licensing of music to AI companies currently exists, and is feasible and practical. SOCAN is well-positioned to foster the development of that market.

  3. What measures are taken to mitigate liability risks regarding AI-generated content infringing existing copyright-protected works?

SOCAN is not currently aware of any measures taken, by either AI developers or data providers, to mitigate the risk of liability for infringing existing protected works. In any event, SOCAN believes that it would be a mistake to focus on how to “mitigate” liability. A focus on mitigation tends to frame the issue in a way that fails to respect creators and tacitly accepts AI companies’ non-compliance with established copyright laws and policy. The focus should be on fostering the development of a licensing market and encouraging AI developers to obtain permission before using creators’ works to program their AI models or for other purposes. If that permission is not obtained, then AI developers should be subject to copyright infringement claims like any other user.

Question: Text and Data Mining

The Government of Canada invites views on whether any clarification is needed on how the copyright framework applies to text and data mining (TDM) activities, notably on how and when rights holders could or should be compensated for the use of copyright-protected content as inputs in the development of AI. Although all comments are welcomed, the Government is particularly interested in receiving feedback on the following questions:

  • What would more clarity around copyright and TDM in Canada mean for the AI industry and the creative industry?
  • Are TDM activities being conducted in Canada? Why or why not?
  • Are rights holders facing challenges in licensing their works for TDM activities? If so, what is the nature and extent of those challenges?
  • What kind of copyright licenses for TDM activities are available, and do these licenses meet the needs of those conducting TDM activities?
  • If the Government were to amend the Act to clarify the scope of permissible TDM activities, what should be its scope and safeguards? What would be the expected impact of such an exception on your industry and activities?
  • Should there be any obligations on AI developers to keep records of or disclose what copyright-protected content was used in the training of AI systems?
  • What level of remuneration would be appropriate for the use of a given work in TDM activities?
  • Are there TDM approaches in other jurisdictions that could inform a Canadian consideration of this issue?

Response:

  1. Licensing is workable, practical, and necessary to follow appropriate copyright principles.

As Canada’s largest and most active licensor of musical works, SOCAN firmly believes that the licensing of musical works to AI companies for TDM or other purposes is not only workable and practical, but the best and most appropriate way to build on the foundational purpose of copyright.

SOCAN is also confident that a licensing market, which is already developing, will flourish in Canada, if given the opportunity to do so. SOCAN therefore urges the Government not to enact any new or modified exceptions for TDM, which would destroy this market and prevent creators from being compensated for valuable uses of their work.

Many of SOCAN’s songwriter and composer members depend entirely on copyright, and their ability to control and be paid for the use of their works, for their livelihoods. Since many members are not also recording or performing artists, they do not have the opportunity to generate income from touring, sponsorships, merchandise, or other such projects.

AI developers benefit from the use of high-quality protected works, including musical works, for the programming of their AI models.

There is no justification for an AI developer to reap the full benefits of a creator’s labour without permission and without providing any remuneration to the creator. That would be contrary to the objectives of the Copyright Act, which include securing a just reward for the creator and preventing “someone other than the creator from appropriating whatever benefits may be generated.” [Théberge v Galerie d’Art du Petit Champlain inc, 2002 SCC 34 at para 30 <https://canlii.ca/t/51tn#par30>]. Those objectives are especially important because of the unique risks that generative AI models pose to human creators. After using massive amounts of creators’ works, generative AI models can generate outputs that will compete with the works of those very creators. That creates a serious risk that human songwriters and creators will be displaced and forced to pursue other careers, leading inevitably to an erosion of Canadian culture and the industries that support it.

A licensing model for AI will ensure that songwriters, composers, and their music publishers are able to control and be paid for the use of their works by AI companies, in accordance with Canadian copyright law and policy. SOCAN firmly believes that such a licensing model is feasible and practical, regardless of the number of works or the nature of the technology involved. SOCAN has consistently adapted its licensing processes to respond to major technological developments and market disruptions over the years, including the shift to streaming and digital music consumption. There is no reason for the emergence of generative AI technology to be treated any differently.

SOCAN has the experience and tools necessary to license and administer large catalogues of works for a variety of purposes, including AI-related uses. As already noted, SOCAN connects billions of performances in Canada with millions of rightsholders worldwide. SOCAN grants licences to tens of thousands of users across a wide spectrum of industries and activities. That includes licensing millions of musical works under blanket licences for use in new technologies, such as online music streaming. In 2023 alone, SOCAN licensed more than 170 billion individual performances of musical works to its licensees. There is nothing unique about generative AI that should preclude SOCAN from developing and offering an appropriate licensing scheme for TDM and other AI-related activities.

  2. There should be no exception for TDM activities.

SOCAN urges the Government of Canada not to enact any new or modified copyright exceptions for TDM activities. An exception would wipe out the developing licensing market before it has a chance to mature and flourish. It would also deprive creators of the ability to control and be paid for valuable uses of their works by AI companies. Even if other jurisdictions choose to narrow or limit the scope of copyright protection in relation to TDM activities, Canada should resist the temptation to do the same.

SOCAN similarly urges the Government to reject any proposal that would allow an AI developer to use a creator’s works for TDM activities unless the creator “opts out”. Canadian copyright law is inherently an “opt-in” regime: it requires users to seek authorization from creators before using copyright-protected works. Requiring creators and their representatives to take positive steps to opt out of TDM activities by all AI users, in relation to every webpage or platform on which their works are available, and to monitor all AI platforms for compliance, would be an onerous and unfair task to impose on creators. It may also run afoul of international treaties to which Canada is a signatory, including the Berne Convention. The prejudice is exacerbated by the fact that, once an AI model has “learned” from the works of a creator who did not know they could opt out, or was unable to do so, that learning is extremely difficult, if not impossible, to reverse.

The fact that some AI companies indiscriminately scrape vast amounts of copyrighted works from the Internet, without permission or transparency, is not a reason to reward them with preferential treatment, either by enacting TDM exceptions or adopting an opt-out approach. To the contrary, robust copyright protection is necessary to ensure that users who wield such significant technological power do so in a responsible and ethical way that respects the creators on whose works they depend.

  3. Remuneration is best determined in a voluntary licensing market.

The appropriate level of remuneration for creators whose works are used to program AI models, or for other AI-related purposes, is best determined in the developing licensing market. A free-market voluntary licensing system is the approach most likely to allow creators to control the use of their works while requiring interested parties to agree on the terms, including the price, of that use.

Therefore, SOCAN urges the Government to avoid any approach that would limit a creator’s right to control the use of their works by AI companies. For example, the Government should avoid any form of compulsory licensing system, which would deny creators their right to contract freely in the market and, in doing so, to decide whether, how, and by whom their works are used. A compulsory licence would prevent creators from realizing fair value for the use of their works by AI companies. It would also raise concerns under Canada’s international treaty obligations, which require the core reproduction and performing rights in works to be true exclusive rights, not mere rights of remuneration. A free market voluntary licensing system must be protected to ensure that creators and AI companies can negotiate fair terms for the use of copyright-protected works.

  4. Transparency is paramount.

Stakeholders generally agree that transparency, record-keeping, and disclosure obligations are important to ensure that creators understand how and when their works are used by AI developers and whether that use has been licensed or not. These requirements will help foster the developing licensing market by incentivizing AI developers to obtain permission before using copyright-protected works for programming or other purposes. They are also necessary to address AI’s “black box” problem, which makes it extremely difficult, if not impossible, for creators to know when their works have been used by AI models or to pursue legal remedies without proper disclosure from AI developers.

For a further discussion of record-keeping and disclosure obligations, please refer to our submission to the Consultation’s question on infringement and liability.

Question: Authorship and Ownership of Works Generated by AI

The Government of Canada invites views on how the copyright framework should apply to AI-assisted and AI-generated content. Although all comments are welcomed, the Government is particularly interested in receiving feedback on the following questions:

  • Is the uncertainty surrounding authorship or ownership of AI-assisted and AI-generated works and other subject matter impacting the development and adoption of AI technologies? If so, how?
  • Should the Government propose any clarification or modification of the copyright ownership and authorship regimes in light of AI-assisted or AI-generated works? If so, how?
  • Are there approaches in other jurisdictions that could inform a Canadian consideration of this issue?

Response:

It is not currently necessary to amend the Copyright Act to address authorship or copyright ownership of AI-generated or AI-assisted content. The Copyright Act is based on a philosophy of providing incentives for human creativity, and SOCAN considers the statute—including, for example, the tying of the general term of copyright ownership to the life of a human author—to be clear that the author of a work must be a human [Copyright Act, RSC 1985, c C-42, s 6].

The authorship and ownership of works that are created with the assistance of AI tools can be determined on the specific facts of each case. In its current form, the Copyright Act is sufficient to allow courts to develop the law by making those determinations. It would be prudent for the Government to avoid amending the Act unless and until the Government determines that judicial decisions, whether in Canada or elsewhere, expose gaps in the law that need to be addressed.

SOCAN’s operations, and indeed the music industry more broadly, are premised on the understanding that music creators are individuals, consistent with the longstanding principle that authors must be human. If this principle is changed, it could have unintended consequences across a music industry that is built on a foundation of human expression.

Question: Infringement and Liability Regarding AI

The Government of Canada invites views on questions about copyright infringement and liability raised by AI, particularly since there is a lack of evidence currently available in this regard. Although all comments are welcomed, the Government is particularly interested in receiving feedback on the following questions:

  • Are there concerns about existing legal tests for demonstrating that an AI-generated work infringes copyright (e.g., AI-generated works including complete reproductions or a substantial part of the works that were used in TDM, licensed or otherwise)?
  • What are the barriers to determining whether an AI system accessed or copied a specific copyright-protected content when generating an infringing output?
  • When commercialising AI applications, what measures are businesses taking to mitigate risks of liability for infringing AI-generated works?
  • Should there be greater clarity on where liability lies when AI-generated works infringe existing copyright-protected works?
  • Are there approaches in other jurisdictions that could inform a Canadian consideration of this issue?

Response:

A critical issue in the field of AI is the “black box” problem, meaning there is a lack of visibility into the works used to program an AI model or how those works, once copied, affect the model’s programming.

Due to transparency problems, creators typically have no way to know or detect whether their works have been used to develop an AI model. Even if a creator suspects that their work has been used for programming purposes, it would be extremely difficult, if not impossible, to confirm that use. Large-scale infringement carried out without the knowledge of creators creates a significant windfall for AI developers at the expense of those creators.

The black box problem also makes it difficult, if not impossible, to prove that an AI-generated work infringes copyright in a creator’s existing work. Even if the two works are substantially similar, a lack of transparency and record-keeping by an AI company will thwart efforts to prove that the AI model used, and therefore had “access” to, the creator’s work, which is a necessary element of the test for copyright infringement.

In short, without appropriate transparency, disclosure, and record-keeping, it would be difficult, if not impossible, for a creator to know that their rights have been infringed, much less to pursue and obtain any remedy for that infringement.

SOCAN therefore urges the Government to require AI developers, and every person involved in the programming and testing of an AI model, to keep and make readily available detailed and accurate records of the works they have used for that development and of how they have used them. Those records should include the source of the works, details of any licences authorizing their use, and how the works are kept, maintained, or stored by the AI developers. AI developers are best positioned to track that information.

SOCAN believes that, with robust transparency, record-keeping, and disclosure obligations, coupled with established copyright principles, the current Copyright Act will be sufficient to address liability issues specific to AI technology.