Feature Store 101: Machine Learning

What is a feature store? Machine learning does not run on models alone. To make predictions, you also need features.

According to Wikipedia, a feature is an individual measurable property or characteristic of a phenomenon.

This definition can be confusing, so let's clarify it. In simple terms, a feature is a descriptive attribute that is relevant to predicting how something will behave in the world.

Features are usually numeric and can be conveniently collected into a feature vector. In other words, a set of features forms the input vector for a model.

For example, a feature vector may be produced from a row of a CSV file.
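
As a minimal sketch of the CSV case, the snippet below turns each row into a numeric feature vector. The column names (age, annual_income, open_loans) and the sample values are made up for illustration, and the sketch assumes every column is already numeric.

```python
import csv
import io

# Hypothetical CSV content; column names and values are illustrative assumptions.
RAW_CSV = """age,annual_income,open_loans
34,52000.0,2
51,87000.5,0
"""

def row_to_feature_vector(row, columns):
    """Convert one CSV row (a dict of strings) into a list of floats."""
    return [float(row[name]) for name in columns]

def load_feature_vectors(csv_text):
    """Parse CSV text and return one feature vector per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames  # header row defines the feature order
    return [row_to_feature_vector(row, columns) for row in reader]

vectors = load_feature_vectors(RAW_CSV)
print(vectors)  # [[34.0, 52000.0, 2.0], [51.0, 87000.5, 0.0]]
```

Real data usually also contains text or categorical columns, which would need encoding before they can join a numeric vector.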

But what is a feature in layperson's terms?

As an example, your credit history is clearly relevant to predicting whether your next loan will be approved, while your eye color is not. That's why any aggregates derived from your credit history are of interest to machine learning models. In this context, a feature could be the sum of all monthly payments made during the last year. That is just a simple numeric feature.
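
The payment-sum feature can be sketched in a few lines. The function name, the sample payment records, and the choice of a 365-day window for "the last year" are assumptions made for this illustration.

```python
from datetime import date, timedelta

def total_payments_last_year(payments, as_of):
    """Sum the amounts of payments made in the 365 days up to and including as_of."""
    window_start = as_of - timedelta(days=365)
    return sum(
        amount
        for paid_on, amount in payments
        if window_start < paid_on <= as_of
    )

# Hypothetical payment history: (payment date, amount).
payments = [
    (date(2023, 1, 15), 300.0),  # older than one year, excluded
    (date(2023, 9, 15), 300.0),
    (date(2024, 1, 15), 350.0),
    (date(2024, 5, 15), 350.0),
]

feature = total_payments_last_year(payments, as_of=date(2024, 6, 1))
print(feature)  # 1000.0
```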

AI models need many features to train on, and, as a rule, the more data you have, the better the predictions you obtain. Features do not come from nowhere: source data has to be pulled from some storage, and then the features need to be computed and saved in a persistent place. This is where feature engineering comes in. Feature engineering means transforming raw data into a feature vector. Usually, this process takes a significant amount of time.
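
The pull, compute, and save steps can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the raw records, the customer_features table, and the function names are hypothetical, and SQLite from the Python standard library stands in for whatever persistent storage a real system would use.

```python
import sqlite3

# Hypothetical raw records pulled from source storage: (customer_id, monthly_payment).
RAW_PAYMENTS = [
    ("c1", 300.0), ("c1", 350.0),
    ("c2", 120.0), ("c2", 130.0), ("c2", 125.0),
]

def compute_features(raw_rows):
    """Aggregate raw rows into per-customer (payment_count, payment_total)."""
    features = {}
    for customer_id, amount in raw_rows:
        count, total = features.get(customer_id, (0, 0.0))
        features[customer_id] = (count + 1, total + amount)
    return features

def save_features(conn, features):
    """Persist the computed features so they can be reused later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_features ("
        "customer_id TEXT PRIMARY KEY, payment_count INTEGER, payment_total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer_features VALUES (?, ?, ?)",
        [(cid, count, total) for cid, (count, total) in features.items()],
    )
    conn.commit()

def load_feature_vector(conn, customer_id):
    """Read one customer's stored features back as a numeric vector."""
    row = conn.execute(
        "SELECT payment_count, payment_total FROM customer_features "
        "WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return None if row is None else [float(value) for value in row]

# ":memory:" keeps the demo self-contained; a file path would make it persistent.
conn = sqlite3.connect(":memory:")
save_features(conn, compute_features(RAW_PAYMENTS))
print(load_feature_vector(conn, "c2"))  # [3.0, 375.0]
```

Storing the computed values means they can be read back at training or prediction time without repeating the aggregation.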