ML.NET Concepts – Data

Hi guys! This is the first in a series of articles about the ML.NET framework.

Today I'm writing about Data.

Data

Data is represented by the IDataView interface. You can think of it as something like a SQL view.

Data is a lazily evaluated, cursorable, heterogeneous, schematized dataset.

A data view has two main parts:

Schema

An instance of the ISchema interface. It contains all the information about the columns of the data view.

Every column has a name, a type, and metadata.

Use the vector type vector<T,N>, where T is the item type and N is the size of the vector, to represent a group of values associated with a row.
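To make the idea of a schema more concrete, here is a small conceptual sketch in Python. It is not the ML.NET API: the names Column, VectorType and Schema, the type names ("bool", "float32") and the metadata key are all made up for illustration. It only shows columns that carry a name, a type and metadata, plus a vector<T,N> type.

```python
from dataclasses import dataclass, field
from typing import Tuple, Union


@dataclass(frozen=True)
class VectorType:
    """Hypothetical stand-in for vector<T,N>: item type T and size N."""
    item_type: str
    size: int

    def __str__(self) -> str:
        return f"vector<{self.item_type},{self.size}>"


@dataclass(frozen=True)
class Column:
    """One schema column: a name, a type, and optional metadata."""
    name: str
    type: Union[str, VectorType]
    metadata: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Schema:
    """An ordered, read-only collection of columns."""
    columns: Tuple[Column, ...]

    def column_names(self) -> list:
        return [c.name for c in self.columns]

    def get(self, name: str) -> Column:
        # Look a column up by name; raise if it does not exist.
        for column in self.columns:
            if column.name == name:
                return column
        raise KeyError(name)


# Illustrative schema: a scalar label and a 4-element feature vector.
schema = Schema(columns=(
    Column("Label", "bool"),
    Column("Features", VectorType("float32", 4),
           {"slot_names": ["a", "b", "c", "d"]}),
))

print(schema.column_names())
print(schema.get("Features").type)
```

The point of the sketch is that the schema describes the columns without holding any row values; those are reached through cursors.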

Cursors

The data view is a source of cursors. Think of them like SQL cursors: a cursor is an object that moves through the data one row at a time and exposes the values available in the current row.

We can open any number of cursors over the same data, because the data is immutable.

Note: a cursor usually accesses only a subset of the columns, and for efficiency the columns that the cursor does not need are not computed.
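The behaviour described above can be sketched with a toy example in Python. Again, this is not the ML.NET API: LazyView, its cursor method and the evaluations log are hypothetical names used only for illustration. The sketch shows two independent cursors over the same immutable rows, where a computed column is evaluated only if a cursor asked for it.

```python
from typing import Callable, Dict, Iterator, List, Sequence


class LazyView:
    """Toy data view: immutable source rows plus columns computed on demand."""

    def __init__(self, rows: Sequence[dict],
                 computed: Dict[str, Callable[[dict], object]]):
        self._rows = tuple(rows)          # never modified after construction
        self._computed = dict(computed)   # column name -> function of a source row
        self.evaluations: List[str] = []  # log of computed columns, for the demo only

    def cursor(self, columns: Sequence[str]) -> Iterator[dict]:
        """Yield one row at a time, computing only the requested columns."""
        for row in self._rows:
            out = {}
            for name in columns:
                if name in self._computed:
                    self.evaluations.append(name)
                    out[name] = self._computed[name](row)
                else:
                    out[name] = row[name]
            yield out


view = LazyView(
    rows=[{"x": 1}, {"x": 2}, {"x": 3}],
    computed={
        "squared": lambda r: r["x"] ** 2,
        "cubed": lambda r: r["x"] ** 3,
    },
)

# Two independent cursors over the same data, each with its own column subset.
first = view.cursor(["x", "squared"])
second = view.cursor(["x"])

print(next(first))       # first row seen by the first cursor
print(next(first))       # second row seen by the first cursor
print(next(second))      # the second cursor still starts at the first row
print(view.evaluations)  # "cubed" never appears: no cursor requested it
```

Opening a cursor computes nothing by itself; work happens only when a row is requested, and only for the columns that cursor needs.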

Data is lazy!

References:

ML.NET Tutorial

IDataView Design Principles

Transact-SQL Cursors