Scott Nicholas writes at the Linux Foundation blog:
A key goal of some open collaboration efforts — whether source code or specification oriented — is to prevent technical ‘drift’ away from a core set of functions or interfaces. Projects seek a means to communicate — and know — that if a downstream product or open source project is held out as compatible with the project’s deliverable, that product or component is, in fact, compatible. Such compatibility strengthens the ecosystem by providing end-users with confidence that data and solutions from one environment can work in another conformant environment with minimal friction. It also provides product and solution providers a stable set of known interfaces they can depend on for their commercially supported offerings.
A trademark conformance program, one of the supporting programs the LF offers its projects, can be used to encourage conformance with the project’s code base or interfaces. Anyone can use the open source project code however they want, subject to the applicable open source license, but if a downstream solution wants to describe itself as conformant using the project’s conformance trademark, it must meet the project’s definition of “conformant.” Some communities choose words other than “conformant”, including “certified”, “ready”, or “powered by”, in association with commercial uses of the open source codebase. This is the approach that some Linux Foundation projects take to maintain compatibility and reduce fragmentation of code and interfaces.
The Linux Foundation has produced a new whitepaper, in English and Chinese, about export controls and open source, and has summarized its findings on its blog:
The primary source of United States federal government restrictions on exports is the Export Administration Regulations (EAR). The EAR is published and updated regularly by the Bureau of Industry and Security (BIS) within the US Department of Commerce. The EAR applies to all items “subject to the EAR,” and may control the export, re-export, or transfer (in-country) of such items.
Under the EAR, the term “export” has a broad meaning. Exports can include not only the transfer of a physical product from inside the US to an external location but also other actions. The simple act of releasing technology to someone other than a US citizen or lawful permanent resident within the United States is deemed to be an export, as is making available software for electronic transmission that can be received by individuals outside the US.
This may seem alarming for open source communities, but the good news is that open source technologies published and made publicly available to the world are not subject to the EAR. Therefore, open source remains one of the most accessible models for global collaboration.
Intellectual property, and how it is shared, has been the cornerstone of open source. Although it is more common to discuss “code” or “copyright,” there are other IP concerns around patents and trademarks that must be considered before investing time and effort in a major open source project. There are long-established practices that govern these matters. Companies and lawyers involved in open source have been working on and evolving open source project trademark matters for decades.
Neutral control of trademarks is a key prerequisite for open source projects that operate under open governance. When trademarks of an open source project are owned by a single company within a community, there is an imbalance of control. The use of any trademark must be actively controlled by its owner or the owner will lose the right to control its use. The reservation of this exclusive right to exercise such control necessarily undermines the level playing field that is the basis for open governance. This is especially the case where the trademark is used in association with commercial products or solutions.
Open source licenses enable anyone to fork the code and distribute and modify their own version. Trademarks, however, operate differently. Trademarks identify a specific source of the code. For example, we all know MariaDB is not the same as MySQL. They’ve each developed their own brand, although they’re derived from a common codebase. The key question is: who decides when a company should be allowed to associate its product or solution with the brand of the community?
A trademark is a word, phrase, or design that denotes a “brand” distinguishing one source of product or solution from another. The USPTO describes the usage of trademarks “to identify and distinguish the goods/services of one seller or provider from those of others, and to indicate the source of the goods/services.” Under US trademark law you cannot effectively separate ownership of a project mark from control of the underlying open source project. While some may create elaborate structures around this, at the end of the day an important principle to follow is that the project community should control what happens to their brand: the trademark they collectively built up in parallel with building up the functionality of their code.
For this reason, in communities that deem their brand important, we also file registrations for trademark protection to reserve the rights in the mark for the project, commonly in the United States, China, the European Union, Japan, and other countries around the world. Registered marks will often have a ® symbol. This is different from a common law trademark right, where you often see a ™ symbol with the mark. Having a registered trademark is often important because it enables us to better protect the community against misrepresentation, misuse, and confusion in the ecosystem between what is actually the community-built project and what is not. This is often based on specific benefits that arise from the registration, which may vary from country to country.
The Linux Foundation started hosting projects outside of Linux a decade ago. From the outset, the brand of a project community we host has been an important asset that we have been asked to protect for our communities. The communities’ goals and motivations are always different, but, in general, the organization contributing a trademark usually wants to ensure it denotes the community they’re helping to establish at the LF, and the other participants in the ecosystem want the confidence that one company can’t tell them what they can or cannot do with a project we host because they retained ownership of the trademark.
This neutrality is the very essence of what we try to establish at the Linux Foundation with our projects. Our projects are set up to be neutral – the Linux Foundation or our project entities own the mark. We then put the control over decisions about the mark into the hands of our project communities, to be determined by them in an open and transparent manner to achieve their collective goals.
For example, in March of 2017, we participated in a meeting hosted at KubeCon in Berlin, where the organizations involved in Kubernetes sat down in a packed room to discuss what they wanted to do with the Kubernetes brand as it related to companies using Kubernetes in conjunction with their commercial products or solutions. When drafting the governance for CNCF, Google had insisted it was important for the Linux Foundation to also own the Kubernetes mark as part of CNCF, so that branding control would go hand in hand with neutral, community-driven governance.
However, the LF was not in a position to determine when one company should or should not be able to say its solution was a “Kubernetes”-based product. We needed a program to allow companies and other organizations to use the trademark commercially to denote their distribution of, or compatibility with, the community’s Kubernetes releases. That initial group worked for months to define what it means to have a conformant Kubernetes distribution. That’s also why the promise of portability amongst cloud providers actually works today. Those technical experts from the community as a whole defined exactly what it would take to deliver on the promise of portability. The definition of conformance they established has since been backed up by the neutral ownership of the Kubernetes trademark in the Linux Foundation. What’s even more important is that the community remains in control of the program. In fact, the definition of conformance is controlled by Kubernetes’ SIG Architecture and evolves through a carefully controlled process with each release, as new APIs become stable and obsolete ones are deprecated.
This same story has played out in other communities we have hosted. Many communities have built consensus around what it means to be compatible or conformant with the releases coming from our project communities. So many, in fact, that we recently wrote an entire blog post just about the topic.
What these examples show is that a community can neutrally manage a trademark within the LF’s structure. We tend to refer to these as “community-managed trademark” programs. The marks are owned by the LF entity for the project, and we work with the communities we serve to establish the rules around usage of our marks.
Recently there has been a new round of conversations about open source projects and ownership of trademarks. Understandably, there has even been concern that open source hasn’t addressed trademark issues as they relate to major OSS projects. This is not the case. While the motivations vary, one aspect remains constant: trademark law.
Of the fundamental structural questions that drive discussions within the open source community, two that continually spur fervent debate are (a) whether software code should be contributed under a Contributor License Agreement (“CLA”) or a Developer Certificate of Origin (“DCO”), and (b) whether code developed by an employee or independent contractor should be contributed under a CLA signed by the developer as an individual or by her employer under a corporate CLA.
In 2014, Sandeep Aryal was a system administrator for the Nepalese government who was urging his colleagues to migrate to Linux and open source systems. He was awarded a Linux Foundation Training SysAdmin Superstar scholarship, which he hoped would teach him relevant skills that he could use to push for this transition.
This Linux Foundation Platinum Sponsor-contributed article from Hitachi explains how to use TensorFlow.js and Node-RED to build image recognition applications.
Using TensorFlow.js and Node-RED
TensorFlow.js is a JavaScript implementation of the TensorFlow open source machine learning platform. With TensorFlow.js, training and inference can be executed in real time in the browser or on the server side with Node.js. Node-RED is a visual programming tool mainly developed for IoT applications.
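As a quick illustration (not taken from the article), here is a minimal sketch of server-side TensorFlow.js inference in Node.js using the published pre-trained MobileNet model packages; the image file name is hypothetical:

```javascript
// Minimal TensorFlow.js inference sketch for Node.js (illustrative only).
// Install first: npm install @tensorflow/tfjs-node @tensorflow-models/mobilenet
const tf = require('@tensorflow/tfjs-node');
const mobilenet = require('@tensorflow-models/mobilenet');
const fs = require('fs');

async function classify(path) {
  // Decode the image file into a 3-channel tensor.
  const image = tf.node.decodeImage(fs.readFileSync(path), 3);
  const model = await mobilenet.load();
  // classify() returns e.g. [{ className: 'airliner', probability: 0.93 }, ...]
  const predictions = await model.classify(image);
  console.log(predictions);
  image.dispose(); // free the tensor's memory
}

classify('airplane.jpg'); // hypothetical sample image
```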
According to a recent InfoQ article on 2020 JavaScript web development trends, TensorFlow.js is classified as “Early Majority”, and Node-RED is classified as “Early Adopters” in their adoption cycles. And they are becoming increasingly popular with open source software developers.
In this article, we’ll take a look at what you can do with these two trending open source software tools in combination.
Creating a sample image recognition flow with Node-RED
Our objective will be to create a flow within Node-RED to recognize an object in an image, as depicted in the screenshot below.
This flow runs after you upload an image file from the browser using the yellow node component. The bottom left of the user interface displays the uploaded image via the “Original image” node. The orange “Image recognition” node uses the trained TensorFlow.js model to analyze what is in the uploaded image (an aircraft). Finally, the green “Output result” node in the upper right corner outputs the recognition result to the debug tab on the right. Additionally, an image annotated with an orange square is displayed under the “Image with annotation” node, making it easy to see which part of the image was recognized.
In the following sections, we will explain the steps for creating this flow. For this demo, Node-RED can run in the local environment (in this case, a Raspberry Pi) and also in a cloud environment — it will work regardless of platform choice. For our tests, Google Chrome was chosen for use with the Node-RED web user interface.
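If Node-RED is not already installed in your environment, it can typically be set up with npm (for example, sudo npm install -g --unsafe-perm node-red) and started with the node-red command; by default, the flow editor is then available in a browser at http://localhost:1880. These are Node-RED’s standard installation steps, not something specific to this demo.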
Installing a TensorFlow.js node
The Node-RED flow library has several TensorFlow.js-enabled nodes. One of these is node-red-contrib-tensorflow, which contains the trained models.
We’ll begin with installing the TensorFlow.js node in Node-RED. To install the node, go to the top-right menu of the flow editor. Click “Manage Palette” -> Go to “Palette” tab -> Select “Install” tab. After that, enter “node-red-contrib-tensorflow” in the search keyword field.
As shown in the image above, the TensorFlow.js node to be used is displayed in the search results. Click the “install” button to install the TensorFlow.js node. Once the installation is complete, orange TensorFlow.js nodes will appear in the Analysis category of the left side palette.
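Alternatively, palette nodes can generally also be installed from the command line by running npm install node-red-contrib-tensorflow inside the Node-RED user directory (typically ~/.node-red) and then restarting Node-RED.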
Each TensorFlow.js node is described in the following table. All of them are image recognition nodes; some can additionally generate annotated image data, and some can run offline, which is necessary for edge analytics.
| # | Name | Description | Annotated Image | Offline Use |
|---|------|-------------|-----------------|-------------|
| 1 | cocossd | A node that returns the name of the object in the image | Yes | Possible |
| 2 | handpose | A node that estimates the positions of fingers and joints from a hand image | No | Not possible |
| 3 | mobilenet | A node that returns the name of the object in the image | No | Possible |
| 4 | posenet | A node that estimates the positions of arms, head, and legs from the image of a person | Yes | Possible |
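To see why cocossd and posenet can produce an annotated image while mobilenet cannot, it helps to look at what the underlying models return. Below is a hypothetical, standalone sketch (not the node’s internal code) of calling the coco-ssd model directly in Node.js; unlike a pure classifier, it returns bounding boxes that can be drawn onto the input image:

```javascript
// Illustrative coco-ssd detection sketch (not from the article).
// Install first: npm install @tensorflow/tfjs-node @tensorflow-models/coco-ssd
const tf = require('@tensorflow/tfjs-node');
const cocoSsd = require('@tensorflow-models/coco-ssd');
const fs = require('fs');

async function detect(path) {
  const image = tf.node.decodeImage(fs.readFileSync(path), 3);
  const model = await cocoSsd.load();
  // Each prediction includes a bounding box, which is what makes an
  // annotated output image possible:
  // [{ bbox: [x, y, width, height], class: 'airplane', score: 0.9 }, ...]
  const predictions = await model.detect(image);
  console.log(predictions);
  image.dispose();
}

detect('airport.jpg'); // hypothetical sample image
```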
In addition, the following nodes, which are required to work with image data in Node-RED, should be installed in the same way.
– node-red-contrib-browser-utils: A node that uploads image files and audio files from the flow editor
– node-red-contrib-image-output: A node that displays an image on the flow editor
After installing node-red-contrib-browser-utils, you should see the file-inject node, microphone node, and camera node in the input category. Also, once you have installed node-red-contrib-image-output, you should see the image node in the output category.
Creating a flow
Now that we have the necessary nodes, let’s create the flow.
From the palette on the left, place a yellow file inject node, an orange cocossd node, and a green debug node (displayed as msg.payload once placed in the workspace), and connect the ports of each node with “wires”.
To check the image data flowing through the wires, place two image nodes (displayed as image preview when placed on the workspace) under the flow. To display the images produced by the file inject node and the cocossd node respectively, connect each image preview node to the corresponding output port, as shown in the illustration.
Only the image preview node on the right needs a different image data variable to display, so its settings must be changed. To change the settings, double-click the image preview node to open the node properties screen. There, the image data stored in msg.payload is displayed by default. By changing this to msg.annotatedInput, as shown in the screenshot below, the image preview node will display the annotated image.
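If you want to shape the recognition result before it reaches the debug node, you can also insert a Node-RED function node between the cocossd node and the debug node. The message format below is an assumption based on the coco-ssd output shown earlier, not something specified by the article:

```javascript
// Hypothetical function-node body: reduce detections to class names.
// Assumes msg.payload is an array like [{ class: 'airplane', score: 0.9 }].
const detections = Array.isArray(msg.payload) ? msg.payload : [msg.payload];
msg.payload = detections
    .filter(d => d.score === undefined || d.score > 0.5) // keep confident hits
    .map(d => d.class);
return msg;
```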
Give each node an appropriate name, press the red deploy button on the upper right, and then click the button on the left side of the file inject node to upload the sample image file of the airport from your PC.
As shown, an image with orange annotation on the aircraft is displayed under the “Image with annotation” node. Also, you can see that the debug tab on the right side correctly displayed “airplane”.
Feel free to try this with images you have at your disposal and experiment with them to see if they can be recognized correctly.
About the author: Kazuhito Yokoi is an Engineer at Hitachi’s OSS Solution Center, located in Yokohama, Japan.
Once more, at The Linux Foundation’s virtual Open Source Summit, VMware’s Chief Open Source Officer, Dirk Hohndel, and Linux’s creator, Linus Torvalds, had a wide-ranging conversation about Linux development.
The illustrious pair started with Hohndel asking about the large size of the recent Linux kernel 5.8 initial release. Hohndel wondered if it might have been so big because developers were staying home thanks to the coronavirus. Torvalds, who always worked at home, said, “I suspect 5.8 might be [so large] because of people staying inside but it might also be, it’s just happened that several different groups ended up coming at roughly the same time, with new features in 5.8.”
Harbor is an open-source cloud native registry project that stores, signs, and scans content. Harbor was created by a team of engineers at VMware China. The project was contributed to CNCF for wider adoption and contribution. Recently the project announced its 2.0 release. Swapnil Bhartiya, the founder of TFiR.io, sat down with Michael Michael, Harbor maintainer and VMware’s Director of Product Management, to talk about Harbor, community and the latest release.
Here is a lightly edited transcript of the interview:
Swapnil Bhartiya: Let’s assume that you and I are stuck in an elevator and I suddenly ask you, “What is Harbor?” So please, explain what it is.
Michael Michael: Hopefully you’re not stuck in the elevator for long; but Harbor essentially is an open source cloud-native registry. Think of this as a repository where you can store and serve all of your cloud-native assets: your container images, your Helm charts, and everything else you need to build cloud native applications. And then, on top of that, we put some very good policy engines that allow you to enforce compliance, make sure the images you’re serving are free from vulnerabilities, and make sure you have all the guardrails in place so that an operator can manage this registry and deliver it to developers in a self-service way.
Swapnil Bhartiya: Harbor came out of VMware China, so I’m curious: what was the problem the team saw at that point? There were already a lot of projects doing something similar, so what unique need led to Harbor being created?
Michael Michael: So essentially the need there was, there wasn’t really a good way for an enterprise to have a hosted registry that has all of the enterprise capabilities they were looking for, while at the same time having full control over the registry. A lot of the cloud providers have their own registry implementation, there’s Docker Hub out there, or you can go and purchase something at a very expensive price point. But if you’re looking for an open source solution that gives you end-to-end registry capabilities, where your developers can push and pull images, and your operators can put a policy in place that says, hey, I want to allow this development team to create a project, but not use more than a terabyte of storage, none of those solutions had that. So there was a need, a business need, to develop a registry. And on top of that, we realized that it wasn’t just us that had this need; there were a lot of users and enterprises out there in the cloud native ecosystem.
Swapnil Bhartiya: The project has been out for a while, and based on what you just told me, I’m curious what kind of community it has built around itself and how the project has evolved. We will also talk about the new 2.0 release, but before that, I want to talk about the evolution of the project and the community around it.
Michael Michael: The project has evolved fairly well over the years; we have increased our contributors. The contribution statistics that the CNCF creates show that we’re growing our community. We now have maintainers in the project from multiple organizations, and there are actually three organizations that have more than one maintainer on the project. So it’s kind of showing you that the ecosystem has picked up. We are adding more and more functionality into Harbor, and we’re also making Harbor pluggable. So there are areas of Harbor where we’re saying, hey, here’s the default experience with Harbor, but if you want to extend the experience based on the needs of your users, go ahead and do that, and here’s an easy way to implement an interface and do that. That has really increased the popularity of Harbor. That means two things: we can give you a batteries-included version of Harbor from the community, and then we give you the option to extend that to fit the needs of your organization.
And more importantly, if you have made investments in other tooling, you can plug and play Harbor in that. When I say other tooling, I mean, things like CI/CD systems, those systems are primarily driving the development life cycle. So for example, you go from source code to container image to something that’s stored in a registry like Harbor. The engine that drives the pipeline, that workflow in a lot of ways is a CI/CD engine. So how do you integrate Harbor well with such systems? We’ve made that a reality now and that has made Harbor easier to put in an organization and get it adopted with existing standards and existing investments.
Swapnil Bhartiya: Now let’s talk about the recently announced 2.0. Talk about some of the core features, functionalities that you are excited about in this release.
Michael Michael: Absolutely, there are three or four features that really, really excite me. A long time coming is the support for OCI, the Open Container Initiative, which is essentially creating a standardized way to describe what an image looks like. In Harbor 2.0, we are able to announce that we have full OCI support in Harbor. What does that mean for users? In previous releases of Harbor you could only put two types of artifacts into Harbor: a container image and a Helm chart. That satisfies a huge number of the use cases for customers, but it’s not enough in this new cloud native ecosystem. There are additional things that as a developer, as an operator, as a Kubernetes administrator, you might want to push into a repository like Harbor and have them also benefit from the policy engine that Harbor provides.
To give you a few examples: CNAB bundles (cloud native application bundles), OPA files, Singularity files, and other OCI-compliant files. So now Harbor says: whatever file type you have out there, if it’s OCI compliant, you can push it to Harbor and you can pull it from Harbor. And then you can add things like quotas and retention policies and immutability policies and replication policies on top of that. Think about that: just by adding a few more types of supported artifacts into Harbor, those types immediately get the full benefit of Harbor’s entire policy engine and the compliance it offers to administrators of Harbor.
Swapnil Bhartiya: What does OCI compliance mean for users? By being compliant, do you have to be more strict about what you can and cannot do? Can you talk about that? And how does it affect existing users; do they have to worry about anything, or does it not really matter?
Michael Michael: Existing users shouldn’t have to worry about this; there’s full backward compatibility. They can still push their container images, which are OCI compliant. And if you were using a Helm chart before, you can still push it into ChartMuseum, which is a key component of Harbor, but you can now also push a Helm chart as an OCI file. So for existing users, not much difference: backward compatibility, we still support them, and we’re not going to forget them. But what it means now is actually not more strictness; it’s a lot more openness. If you’re developing artifacts that are OCI compliant, following the standard way of describing an image and the standard way of executing an image at run time (Kubernetes is also OCI compliant at run time), then you’re getting the benefits of both worlds. You get Harbor as the repository where you can store your images, and you also get a run-time engine that’s OCI compliant that can execute them. A really great benefit for the users.
A couple of other features that Harbor 2.0 brings are super, super exciting. The first one is the introduction of Trivy by Aqua Security as the batteries-included, built-in scanner in Harbor. Previously, we used Clair as our built-in scanner, and with the Harbor 1.10 release that came out in December 2019, we introduced what we call a pluggable framework. Think of this as a way that security vendors like Aqua and Anchore can come in and create their own implementation of a security scanner to do static analysis on top of images that are stored in Harbor.
So we still include Clair as a built-in scanner, and we added additional extension points. But our community and our users love Trivy: its ability to do static analysis across multiple operating systems and multiple application package managers is very well aligned with the vision we have for security in Harbor. We liked Trivy so much that we have now added it as the built-in scanner in Harbor; we ship with it now. A great achievement, and kudos to the Aqua team for delivering Trivy as an open source project.
Swapnil Bhartiya: That’s the question I was going to ask, but I’ll ask it again: what does it mean for users who were using Clair?
Michael Michael: If you were using Clair before and you want to continue using Clair, by all means do; we’re going to continue updating Clair, and Clair is still included in Harbor. There are no changes in the experience. However, if you think Trivy is a better scanner for you, you can use them side by side and compare the scanning results from each scanner, and if Trivy is the better option, we enable you to make that choice. Now, the way Harbor works is that you have a concept of multitenancy, and we isolate a lot of the settings, policies, and organization of images on a per-project basis. So what does that mean? You can actually go into Harbor, define a project, and say, for this project I want Clair to be the built-in scanner.
Clair will then scan all the files in that project. And you can create a second project and say, well, I now want Trivy to be the scanner for this project, and then Trivy will scan those images. If you have the same set of images, you can compare them and see which scanner works best for your needs as an organization and as a user. That’s phenomenal, right? We give users choice and we give them all the data, but ultimately they have to make the decision on which scanner is best for them, based on their scenarios, the types of application images and containers they use, and the libraries used in those containers.
Swapnil Bhartiya: Excellent. Before we wrap this up, what kind of roadmap do you have for Harbor? Of course, it’s an open source project, so there’s no fixed date for when the next release is coming out. But when you look at 2020, what are the major challenges you want to address? What are the problems you want to solve, and what does the basic roadmap look like?
Michael Michael: Absolutely. One of the things that we’ve been trying to do as a maintainer team for Harbor is to create themes around each release, to put a blueprint down in terms of what it is we’re trying to achieve, and then identify the features that make sense in that theme. And we’re not coming up with this in a vacuum: we’re talking to users, we’re talking to other companies, we’ve given presentations at past KubeCon events where individuals came to us with sets of questions, and we have existing users who give us feedback. When we gathered all of that, one of the things we came up with as the theme for our next release is what we call image distribution. We have three key features that we’re trying to tackle in that area.
The first one is: how can Harbor act as a proxy cache? This enables organizations that are deploying Kubernetes environments at the edge, where networking is at a premium, to have a local Harbor instance proxy or mirror images from the mothership, your main data center. Maybe some of the Kubernetes nodes are not even connected to the network, and they want to be able to pull images from the local Harbor, which then pulls the images from the upstream data center. A very, very important feature. Continuing down the path of image distribution, we’re integrating Harbor with both Dragonfly by Alibaba and Kraken by Uber to facilitate peer-to-peer distribution of your container images. How can we efficiently distribute images at the edge, in multiple data centers, and in branch offices that don’t have a good, thick network pipe between them? And how can Harbor make sure that the right images land in the right place? These are big features, and obviously we’re not doing this alone; we’re working with both the Kraken and Dragonfly communities to achieve that.
And the last feature is what we call garbage collection without downtime. Traditionally, we do garbage collection, and this is the process where you reclaim the files and layers of container images that are no longer in use.
Think of an organization that pushes and pulls thousands of images every day; they re-tag them, they create new versions. Sometimes you end up with layers that are no longer used. In order for those layers to be reclaimed from storage by the system, the registry needs to be locked down, meaning nobody can pull or push images while it runs. In Harbor 2.0 we actually made a significant advancement: we track all the layers and the metadata of images in our own database rather than depending on another tool or product to do it. This paves the road so that in the future we can do garbage collection with zero downtime, where Harbor identifies all the layers that are no longer in use and reclaims them, with zero adverse impact or downtime for the users pushing and pulling content. Huge, huge features, and that’s what we’re working on for the future.
Swapnil Bhartiya: Awesome, thank you, Michael, for explaining things in detail and talking about Harbor. I look forward to talking to you again. Thank you.
Michael Michael: Absolutely. Thank you so much for the opportunity.
“This article provides an overview of the major components of a Linux system and describes the interactions between these components. It will explain terms and describe details that may seem very basic, as it doesn’t assume a lot of prior expertise.
Every Linux system has a number of major components. One of these components, the bootloader, is technically outside of Linux and often isn’t talked about. The rest of the components are all software elements that together create the full Linux system.”
Jason Perlow, Editorial Director at the Linux Foundation, had a chance to speak with NASA astronaut Christina Koch. This year, she completed a record-breaking 328 days at the International Space Station for the longest single spaceflight by a woman and participated in the first all-female spacewalk with fellow NASA astronaut Jessica Meir. Christina gave a keynote at the OpenJS Foundation’s flagship event, OpenJS World on June 24, 2020, where she shared more on how open source JavaScript and web technologies are being used in space.