Currently, Hopsworks, Feast, and Iguazio are the only feature stores available as on-premises products. Hopsworks is available in both open-source and Enterprise versions. The open-source version is fully functional, but the Enterprise version adds support for Active Directory and OAuth 2.0 SSO, as well as integration with Kubernetes. Both Feast and Iguazio must be deployed on an existing Kubernetes cluster. Feast is open-source, whereas Iguazio is an enterprise-only platform.
Many companies that use on-premises feature stores have already deployed a data lake and want to add the feature store as their "data warehouse for machine learning". Hopsworks integrates with Cloudera through its support for AD/Kerberos SSO. You can do your feature engineering in your Cloudera cluster and ingest the resulting features from Spark jobs running on Cloudera. You can also do your feature engineering in Python programs running in Jupyter notebooks, IDEs, or on gateway servers, and ingest the features into Hopsworks. Similarly, you can create training data from any external Spark or Python environment. Hopsworks also comes with its own Spark and Python execution environments (you can even run Jupyter notebooks as Jobs on Hopsworks, scheduled by Airflow pipelines). The online feature store can be accessed by any external client that can communicate over JDBC. There are also Python and Scala/Java APIs to the online feature store.
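To make the workflow above concrete, here is a minimal sketch in plain Python. It does not use the Hopsworks API: `ToyFeatureStore`, its methods, and the example feature groups are all hypothetical names invented for illustration. It only shows the general idea that ingested features land in an offline store (used to build training data) and an online store (used for low-latency lookups by key).

```python
from dataclasses import dataclass, field


@dataclass
class ToyFeatureStore:
    """Hypothetical in-memory stand-in for a feature store (not a real API)."""

    # Offline store: feature group name -> list of every ingested row.
    offline: dict = field(default_factory=dict)
    # Online store: feature group name -> {primary key value -> latest row}.
    online: dict = field(default_factory=dict)

    def ingest(self, group, primary_key, rows):
        """Append rows to the offline store and upsert them into the online store."""
        self.offline.setdefault(group, []).extend(dict(r) for r in rows)
        latest = self.online.setdefault(group, {})
        for row in rows:
            latest[row[primary_key]] = dict(row)

    def create_training_data(self, groups, key):
        """Inner-join the offline rows of several feature groups on a shared key."""
        first, *rest = groups
        joined = [dict(r) for r in self.offline[first]]
        for group in rest:
            lookup = {}
            for r in self.offline[group]:
                lookup.setdefault(r[key], []).append(r)
            joined = [
                {**left, **right}
                for left in joined
                for right in lookup.get(left[key], [])
            ]
        return joined

    def get_online_features(self, group, key_value):
        """Return the latest row for a key, or None if the key is unknown."""
        return self.online[group].get(key_value)


# Features engineered in any external Python or Spark environment
# would be ingested like this (assumed example data).
store = ToyFeatureStore()
store.ingest("customer_activity", "customer_id", [
    {"customer_id": 1, "purchases_30d": 4},
    {"customer_id": 2, "purchases_30d": 0},
])
store.ingest("customer_labels", "customer_id", [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
])

# Training data is built by joining feature groups from the offline store.
training_rows = store.create_training_data(
    ["customer_activity", "customer_labels"], key="customer_id"
)

# A serving application looks up the latest features by primary key.
online_row = store.get_online_features("customer_activity", 2)
```

In a real deployment, ingestion and training-data creation would go through the feature store's own Python or Spark client, and the online lookup would typically be a JDBC or client-library call rather than a dictionary access.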
If you have an existing Kubernetes cluster and want to deploy Kubeflow with an open-source feature store, Feast might be the platform for you. Feast comes with Helm charts for installing the services it requires (Postgres and Redis). Feast does not come with an offline feature store, so you will need to provide an object store, distributed file system, or data warehouse to manage your offline features.
If you have an existing Kubernetes cluster and want to deploy Kubeflow with an Enterprise feature store, Iguazio might be the platform for you. Its feature store is very new (released in early 2021) and still lacks some functionality, but it is catching up. Iguazio sells its own version of Kubeflow, called MLRun, with the addition of the feature store. MLRun is open-source and makes Kubeflow easier to use through a visual ML pipeline platform in which you define feature pipelines, model training, and model serving. Kubeflow supports feature engineering in Python with Kubeflow Pipelines, model training, and model serving with KFServing. MLRun also includes serverless model serving support with Nuclio, Iguazio's own serverless functions for Kubernetes.