incredibly so
docs/index.md CHANGED (+1 -1)
@@ -29,7 +29,7 @@ This new fascinating dataset just dropped on Hugging Face : [French public
The data is stored in 320 large parquet files. The data loader for this [Observable framework](https://observablehq.com/framework) project uses [DuckDB](https://duckdb.org/) to read these files (altogether about 200GB) and combines a minimal subset of their metadata — title and year of publication, most importantly without the text contents —, into a single highly optimized parquet file. This takes only about 1 minute to run in a hugging-face Space.
-The resulting file is small enough (and
+The resulting file is small enough (and incredibly so: the file weighs about 560kB, _only 1.5 bits per row!_), that we can load it in the browser and create “live” charts with [Observable Plot](https://observablehq.com/plot).
In this project, I’m exploring two aspects of the dataset:
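
The data-loader step described in the hunk (DuckDB reading many parquet files and writing a small metadata-only parquet file) could look roughly like the sketch below. This is not the project's actual loader: the column names `title` and `year`, the glob `data/*.parquet`, the output name `metadata.parquet`, the sort order, and the ZSTD compression setting are all assumptions chosen for illustration.

```python
def build_loader_sql(source_glob: str, output_path: str) -> str:
    """Build a DuckDB statement that keeps only two metadata columns.

    `title` and `year` are hypothetical column names; the real dataset
    schema may differ. The text column is simply never selected.
    """
    return (
        "COPY ("
        "SELECT title, year "
        f"FROM read_parquet('{source_glob}') "
        # Sorting is an assumption: sorted columns usually compress better.
        "ORDER BY year, title"
        f") TO '{output_path}' (FORMAT PARQUET, COMPRESSION ZSTD)"
    )


def run_loader(source_glob: str, output_path: str) -> None:
    """Execute the statement with the third-party `duckdb` package."""
    import duckdb  # imported lazily so the SQL builder works without it

    duckdb.execute(build_loader_sql(source_glob, output_path))


# Hypothetical usage:
# run_loader("data/*.parquet", "metadata.parquet")
```

Because only the selected columns are read from each parquet file, a query of this shape never has to load the text contents, which is consistent with the short run time the paragraph reports.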
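
As a quick sanity check on the figures in the added line, the quoted file size and bits-per-row value together imply a row count. The diff does not state the row count itself, so the result below is only what those two numbers imply, and reading "560kB" as 560,000 bytes is an assumption.

```python
FILE_SIZE_BYTES = 560_000  # "about 560kB", read as decimal kilobytes (assumption)
BITS_PER_ROW = 1.5         # figure quoted in the added line

total_bits = FILE_SIZE_BYTES * 8
implied_rows = total_bits / BITS_PER_ROW

print(f"{total_bits:,} bits / {BITS_PER_ROW} bits per row = {implied_rows:,.0f} rows")
```

Under these assumptions the two figures are consistent with a table of roughly 3 million rows.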