fil committed on Feb 18, 2024

Commit 8ea911d (1 parent: 77379da)

incredibly so

Files changed (1)

docs/index.md CHANGED

@@ -29,7 +29,7 @@ This new fascinating dataset just dropped on Hugging Face&nbsp;: [French public
 The data is stored in 320 large parquet files. The data loader for this [Observable framework](https://observablehq.com/framework) project uses [DuckDB](https://duckdb.org/) to read these files (altogether about 200GB) and combines a minimal subset of their metadata — title and year of publication, most importantly without the text contents&nbsp;—, into a single highly optimized parquet file. This takes only about 1 minute to run in a hugging-face Space.
-The resulting file is small enough (and almost incredibly small: about 2.5MB, _less than 1 byte per row!_), that we can load it in the browser and create “live” charts with [Observable Plot](https://observablehq.com/plot).
+The resulting file is small enough (and incredibly so: the file weighs about 560kB, _only 1.5 bits per row!_), that we can load it in the browser and create “live” charts with [Observable Plot](https://observablehq.com/plot).
 In this project, I’m exploring two aspects of the dataset:
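
Reviewer note: the "bits per row" figures in this change can be sanity-checked with a little arithmetic. The sketch below is only illustrative. The row count is not stated in the diff, so `ASSUMED_ROWS` (about 3 million) is an assumption back-derived from the quoted sizes, and the helper name `bits_per_row` is hypothetical.

```python
def bits_per_row(file_size_bytes: int, row_count: int) -> float:
    """Average number of bits the file spends on each row."""
    return file_size_bytes * 8 / row_count


# Assumption: roughly 3 million rows. The diff does not state the row count;
# this value is inferred from "560kB" and "1.5 bits per row".
ASSUMED_ROWS = 3_000_000

# Sizes quoted in the removed and added lines (decimal units assumed).
old_bits = bits_per_row(2_500_000, ASSUMED_ROWS)  # previous file, about 2.5MB
new_bits = bits_per_row(560_000, ASSUMED_ROWS)    # new file, about 560kB

print(f"old: {old_bits:.2f} bits/row ({old_bits / 8:.2f} bytes/row)")
print(f"new: {new_bits:.2f} bits/row")
```

Under that assumed row count, the old file comes out just under 1 byte per row and the new one at roughly 1.5 bits per row, which matches both versions of the sentence.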