Open Source AI Definition – Weekly update April 22

Comments on the forum

A forum user raised the concern that traditional copyright protection might not apply to model weights, because they are essentially mathematical calculations: “licensing them through any kind of copyright license will not be enforceable !! and this means that anybody can use them without any copyright restriction (assuming that they have been made public) and this means that you cannot enforce any kind of provisions such as attribution, no warranty or copyleft”. As a workaround, they suggest using contractual terms instead of relying on copyright, acknowledging that this will trigger a larger conversation.

Discussion on whether “made available” should be changed to “released” or “distributed”
1. One user pointed out that “made available” is the most appropriate, as the suggested wordings would be antagonistic and limiting
Continuation of last week’s issue regarding who these four freedoms are for: deployers, users, or someone else.
1. One user understands it as “We need essential freedoms to enable users…”
2. But then who are we defining as “users”? Is it the person deploying the AI or the one calling the prompt?
3. Another wording is suggested: “Open Source AI is an AI system that is made available under terms that grant, without conditions or restrictions, the rights to…”

Clarification is needed under “Preferred form to make modification to a machine learning system”, specifically regarding the claim: “(The following components are not required,) but their inclusion in releases is appreciated.”
1. Clarification regarding whether this means best practice or is a mere suggestion.
2. Suggestion to change the sentence to “The following components are not required to meet the Open Source AI definition and may be provided for convenience.” The same comment also asks to “consider if those components are provided, can they be provided under different terms that don’t meet the Open Source AI definition, or do they fall under the same OSI compliant license automatically.”
Question regarding the addition of “may” under data transparency in the 0.0.7 draft definition, which was not present in the 0.0.6 draft, considering that the components are described as “required” in the checklist below.
1. (Context: “Sufficiently detailed information on how the system was trained. This may include the training methodologies and techniques, the training data sets used, information about the provenance of those data sets, their scope and characteristics; how the data was obtained and selected, the labelling procedures and data cleaning methodologies.”)
2. Another user seconds this and adds that it should be changed to “must”, or to another term that is definitive.

In case you missed it, the town hall was held last Friday. Access the recordings and slides used here.