Facets: An Open Source Visualization Tool for Machine Learning Training Data

July 18, 2017

Getting the best results out of a machine learning (ML) model requires that you truly understand your data. However, ML datasets can contain hundreds of millions of data points, each consisting of hundreds (or even thousands) of features, making it nearly impossible to understand an entire dataset in an intuitive fashion. Visualization can help unlock nuances and insights in large datasets. A picture may be worth a thousand words, but an interactive visualization can be worth even more.

Working with the PAIR initiative, we’ve released Facets, an open source visualization tool to aid in understanding and analyzing ML datasets. Facets consists of two visualizations that allow users to see a holistic picture of their data at different granularities. Get a sense of the shape of each feature of the data using Facets Overview, or explore a set of individual observations using Facets Dive. These visualizations allow you to debug your data, which in machine learning is as important as debugging your model. They can easily be used inside Jupyter notebooks or embedded into webpages. In addition to the open source code, we’ve also created a Facets demo website. This website allows anyone to visualize their own datasets directly in the browser, with no software installation or setup required and without the data ever leaving their computer.
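To make "the shape of each feature" concrete, here is a minimal sketch of the kind of per-feature summary that an overview-style view is built on: counts, missing values, distinct values, and basic numeric statistics. This is not the Facets API. The function name `summarize_features` and the sample records are hypothetical, and the code uses only the Python standard library.

```python
import statistics


def summarize_features(rows):
    """Return per-feature summary statistics for a list of dict records.

    Hypothetical helper for illustration only; not part of Facets.
    """
    # Collect every feature name that appears in any record.
    feature_names = sorted({name for row in rows for name in row})
    summary = {}
    for name in feature_names:
        values = [row.get(name) for row in rows]
        # Treat None (or an absent key) as a missing value.
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        stats = {
            "count": len(present),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
        # Add numeric statistics only when every present value is numeric.
        if numeric and len(numeric) == len(present):
            stats.update(
                min=min(numeric),
                max=max(numeric),
                mean=statistics.mean(numeric),
                median=statistics.median(numeric),
            )
        summary[name] = stats
    return summary


# Made-up sample records, used only to exercise the helper.
records = [
    {"age": 39, "hours_per_week": 40, "occupation": "clerical"},
    {"age": 50, "hours_per_week": 13, "occupation": "managerial"},
    {"age": 38, "hours_per_week": None, "occupation": "clerical"},
]
overview = summarize_features(records)
for feature, stats in overview.items():
    print(feature, stats)
```

Even on three records, a summary like this surfaces the sort of issue worth debugging before training, such as the missing `hours_per_week` value. An interactive tool shows the same kind of information visually and at far larger scale.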
