ClearlyDefined, one year after

It’s been just over a year since the Open Source Initiative approved the proposal for ClearlyDefined to be a project under its organization. So far the project has successfully built a robust software system in collaboration with lots of folks from the community. We wanted to tell you more about what we’ve built so far and how you can get involved with the project.

What’s ClearlyDefined?

Having clear license data about open source increases everyone’s confidence. Projects want more adoption of their software, and adoption is built on users’ confidence that they know how to use it responsibly. Users of open source projects want to feel confident they know how a project is licensed so they can properly comply with the terms of that license. Organizations and companies building on open source want to feel confident they understand the compliance obligations of all the open source they use.

Enter ClearlyDefined. ClearlyDefined is focused on clarifying data about open source components. Specifically, the initial focus is on three key pieces of data about open source: license, source location, and attribution parties. Clarity on these pieces of data helps everyone know what their obligations are and feel more confident in meeting them.

We have spent the last year as an OSI project building both the software that powers the project and the community around it.

The project is built around a 3-step process:

  1. Harvest: We aim to gather as much data about open source components as possible in an automated way, so first we harvest the interesting data from components using open source tooling. We take open source packages, open them up, run open source scanning tools such as ScanCode, FOSSology, and Licensee on them, and aggregate and summarize the results to produce the best license data we can about the component. Both the raw and the processed data are freely available to our users.
  2. Curate: Even though the tools do a great job at harvesting the interesting data about open source components, in many cases we are still missing data or the data we have is ambiguous. In these cases, ClearlyDefined enables the whole community to come and make changes (“curations”) to the data to improve its quality. These curations are all done in the open, on GitHub, via pull requests, so that discussion and consensus-building happen completely transparently.
  3. Upstream: Ultimately, collecting and clarifying license data after the fact is a losing battle. For all the data that we curate, we are also building a larger process around upstreaming those changes back to the original projects. Over time, as these changes are integrated upstream, new versions of those components and the open source world as a whole will become more clearly defined.
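
To make the relationship between the Harvest and Curate steps concrete, here is a minimal sketch of how a community curation could be layered on top of harvested data. This is not the project's actual implementation: the function name `apply_curation` and the field names in the example data are hypothetical, chosen only to illustrate the idea that a curation fills in or overrides what the scanning tools found.

```python
def apply_curation(harvested: dict, curation: dict) -> dict:
    """Return a new definition with curated values layered over harvested ones.

    Hypothetical helper for illustration only. Nested dictionaries are
    merged key by key; any other curated value replaces the harvested one.
    The harvested input is left unmodified.
    """
    merged = dict(harvested)
    for key, value in curation.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_curation(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative data: the tools found a source location but no clear license.
harvested = {
    "licensed": {"declared": None},
    "described": {"sourceLocation": "https://example.com/some/repo"},
}
# A curation supplies only the missing piece.
curation = {"licensed": {"declared": "MIT"}}

definition = apply_curation(harvested, curation)
```

Because the curation carries only the fields being corrected, it stays small and easy to review in a pull request, and the harvested data itself is never overwritten.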

Can I use ClearlyDefined?

Yes! Anyone who’s interested in clear licensing data about open source can use ClearlyDefined. You can go to clearlydefined.io, browse for your favorite open source component, and use its data in your open source compliance process. If you see something missing or off, you can fix it directly on the site and submit the changes for review!

While you are there, you can use the site to create a NOTICE file. For example, simply drag and drop your NPM package-lock.json file onto the Definitions page and use the Share button to create and download a NOTICE file in the format of your choice. Check out this short video for a quick example.
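
If you are curious what the site has to work out from a dropped package-lock.json, the sketch below lists the components in a lock file as coordinate strings. It rests on two assumptions that should be checked before relying on it: that the lock file uses the `packages` map found in newer npm lock file versions, and that npm components are identified as `npm/npmjs/<namespace>/<name>/<version>` with `-` standing in for an empty namespace. The function name `lock_to_coordinates` is hypothetical.

```python
def lock_to_coordinates(lock: dict) -> list[str]:
    """List the npm packages in a parsed package-lock.json as coordinate strings.

    Assumes the lock file has a "packages" map keyed by install path,
    e.g. "node_modules/lodash" or "node_modules/a/node_modules/@scope/b".
    """
    coordinates = set()
    for path, info in lock.get("packages", {}).items():
        # Skip the root project ("") and entries that are not installed packages.
        if "node_modules/" not in path:
            continue
        version = info.get("version")
        if not version:
            continue
        # The package name is whatever follows the last "node_modules/".
        name = path.rsplit("node_modules/", 1)[-1]
        if name.startswith("@") and "/" in name:
            namespace, package = name.split("/", 1)  # scoped package
        else:
            namespace, package = "-", name  # assumed placeholder for no namespace
        coordinates.add(f"npm/npmjs/{namespace}/{package}/{version}")
    return sorted(coordinates)


# Illustrative lock file contents (normally loaded with json.load).
example_lock = {
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/lodash": {"version": "4.17.21"},
        "node_modules/@babel/core": {"version": "7.24.0"},
        "node_modules/a/node_modules/lodash": {"version": "4.17.21"},
    }
}
components = lock_to_coordinates(example_lock)
```

Duplicates are collapsed because the same package version can be installed at several paths, but it only needs to appear once in a NOTICE file.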

In addition, all of this data and these capabilities are freely available to anyone via REST APIs.
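
As a rough illustration of using the API from a script, the sketch below builds a definition URL and pulls out the three pieces of data the project focuses on. Treat the details as assumptions to verify against the API documentation: the host `api.clearlydefined.io`, the `/definitions/<type>/<provider>/<namespace>/<name>/<revision>` path, and the response field names used in `summarize` are our best understanding rather than a guaranteed contract, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_ROOT = "https://api.clearlydefined.io"  # assumed public API host


def definition_url(type_, provider, namespace, name, revision):
    """Build the (assumed) URL for one component's definition."""
    parts = [type_, provider, namespace or "-", name, revision]
    return API_ROOT + "/definitions/" + "/".join(quote(p, safe="@") for p in parts)


def fetch_definition(*coordinates):
    """Download and parse a definition. Requires network access."""
    with urlopen(definition_url(*coordinates), timeout=30) as response:
        return json.load(response)


def summarize(definition: dict) -> dict:
    """Extract license, source location, and attribution parties.

    The field names are assumptions about the response shape; missing
    fields yield None or an empty list instead of raising an error.
    """
    licensed = definition.get("licensed") or {}
    described = definition.get("described") or {}
    source = described.get("sourceLocation") or {}
    core = (licensed.get("facets") or {}).get("core") or {}
    attribution = core.get("attribution") or {}
    return {
        "license": licensed.get("declared"),
        "source": source.get("url"),
        "attribution": attribution.get("parties") or [],
    }
```

With network access, something like `summarize(fetch_definition("npm", "npmjs", None, "lodash", "4.17.21"))` would return a small dictionary that can feed into a compliance report.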

If you are a developer who is making open source for your community to use, take a minute to make sure you are following the licensing norms of your community. Having a discoverable license file and clear copyright notices goes a long way toward being clearly defined from the beginning.

Can I help ClearlyDefined?

If you are interested in helping us clarify the license data on components we’ve already harvested, we’d love for you to come help us curate. You can learn more about the philosophy, as well as how to do your first curation, in our docs.

Please let us know if you have questions or want to get further involved. We’d love to hear from you as we continue to build this project. You can find us on Discord, Twitter, Google Groups, and GitHub.