website

Last update

October 11, 2021

Licence type

AGPLv3 license

Supported Languages

Python, PySpark, Spark, Flink

Supported Platforms

Debian/Ubuntu, Red Hat/CentOS
online-offline
open-source
on-premises
managed-cloud

Description

⚠️ New, details subject to change.

The Hopsworks Feature Store is available both as open-source software and as a managed service. It connects to a wide range of data sources and supports feature computation in any Spark or Python environment.
Core Features
Access control
Feature data validation
Feature Registry/Search
Offline Feature Store
Online Feature Store
CI/CD Support
Custom Metadata
Single Sign-On
Feature Statistics
Ingestion from Streaming Sources
Time Travel
Web UI
3rd Party Orchestration
Feature Schema Versioning
Feature Visualization
Ingestion from Batch Sources
OAuth-2
Python
PySpark
Spark
Spark Streaming
SQL
Training Data File Formats
External feature group
Flink
Point-in-Time JOINs
Training dataset
Lineage
Feature Categories
Governance
Monitoring
User Experience
Feature Computation
Data Ingestion
Feature Storage

Other details

Core Build

RonDB, HopsFS, S3, Kafka, Open Distro for Elasticsearch

Core APIs

Python, PySpark, Spark, SQL-Spark

Environment

Data Sources

Platform integrations

Architecture

Hopsworks appears to be a complete pipeline with a Feature Store at its core. Beyond feature management, we have also seen a model registry and some model-serving functionality on the platform.
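
Of the core features listed above, Point-in-Time JOINs are probably the least self-explanatory, so here is a minimal sketch of the idea in plain Python. It is not Hopsworks code and does not use the Hopsworks API: the function name `point_in_time_join`, the tuple layouts, and the sample data are all hypothetical, chosen only for illustration. The assumption is that each label row should be paired with the latest feature value recorded at or before the label's event time, so that no future information leaks into the training data.

```python
from bisect import bisect_right


def point_in_time_join(labels, feature_rows):
    """Attach to each label row the most recent feature value whose
    timestamp is <= the label's event time, for the same entity.

    labels:       iterable of (entity, event_time, label)        -- hypothetical layout
    feature_rows: iterable of (entity, timestamp, feature_value) -- hypothetical layout
    """
    # Group feature history per entity, ordered by timestamp.
    by_entity = {}
    for entity, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = by_entity.setdefault(entity, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity, event_time, label in labels:
        times, values = by_entity.get(entity, ([], []))
        # Position of the first feature timestamp strictly after event_time.
        i = bisect_right(times, event_time)
        # None means no feature value was known yet at event_time.
        feature = values[i - 1] if i > 0 else None
        joined.append((entity, event_time, feature, label))
    return joined


# Illustrative data: integer timestamps stand in for real event times.
feature_rows = [
    ("user_1", 10, 0.2),
    ("user_1", 20, 0.5),
    ("user_2", 15, 0.9),
]
labels = [
    ("user_1", 12, 0),
    ("user_1", 25, 1),
    ("user_2", 5, 0),
]
training_rows = point_in_time_join(labels, feature_rows)
```

In this sketch the label for `user_2` at time 5 gets no feature value, because the only feature row for that entity was written later, at time 15. A real feature store would perform the equivalent join at scale over its offline store rather than in memory.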