tldr; Parquet is awesome, DuckDB too!
Datasets on the Hugging Face Hub rely on Parquet files. We can query these files with DuckDB, a fast in-memory database system. One of DuckDB’s features is vector similarity search, which can be used with or without an index.
blog:
https://huggingface.co/learn/cookbook/vector_search_with_hub_as_backend