![Hugging Face x Scikit-learn](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hfxsklearn.png) In this sprint, we will build interactive demos from the scikit-learn documentation and, afterwards, contribute the demos directly to the docs. ## Important Dates šŸŒ… Sprint Start Date: Apr 12, 2023 šŸŒƒ Sprint Finish Date: Apr 30, 2023 ## To get started šŸ¤© 1. Join our [Discord](https://huggingface.co/join/discord) and take the role #sklearn-sprint-participant by selecting "Sklearn Working Group" in the #role-assignment channel. Then, meet us in #sklearn-sprint channel. 2. Head to [this page](https://scikit-learn.org/stable/auto_examples/) and pick an example youā€™d like to build on. 3. Leave a comment on [this spreadsheet](https://docs.google.com/spreadsheets/d/14EThtIyF4KfpU99Fm2EW3Rz9t6SSEqDyzV4jmw3fjyI/edit?usp=sharing) with your name under Owner column, claiming the example. The spreadsheet has a limited number of examples. Feel free to add yours with a comment if it doesnā€™t exist in the spreadsheet. . 4. Start building! We will be hosting our applications in [scikit-learn](https://huggingface.co/sklearn-docs) organization of Hugging Face. For complete starters: in the Hugging Face Hub, there are repositories for models, datasets, and [Spaces](https://huggingface.co/spaces). Spaces are a special type of repository hosting ML applications, such as showcasing a model. To write our apps, we will only be using Gradio. [Gradio](https://gradio.app/) is a library that lets you build a cool front-end application for your models, completely in Python, and supports many libraries! In this sprint, we will be using mostly visualization support (`matplotlib`, `plotly`, `altair` and more) and [skops](https://skops.readthedocs.io/en/stable/) integration (which you can launch an interface for a given classification or regression interface with one line of code). In Gradio, there are two ways to create a demo. One is to use `Interface`, which is a very simple abstraction. Letā€™s see an example. ```python import gradio as gr # implement your classifier here clf.fit(X_train, y_train) def cancer_classifier(df): # simply infer and return predictions predictions = clf.predict(df) return predictions gr.Interface(fn=cancer_classifier, inputs="dataframe", outputs="label").launch() # save this in a file called app.py # then run it ``` This will result in following interface: ![Simple Interface](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/interface.png) This is very customizable. You can specify rows and columns, add a title and description, an example input, and more. Thereā€™s a more detailed guide [here](https://gradio.app/using-gradio-for-tabular-workflows/). Another way of creating an application is to use [Blocks](https://gradio.app/quickstart/#blocks-more-flexibility-and-control). You can see usage of Blocks in the example applications linked in this guide. After we create our application, we will create a Space. You can go to [hf.co](http://huggingface.co), click on your profile on top right and select ā€œNew Spaceā€. ![New Space](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/new_space.png) We can name our Space, pick a license and select Space SDK as ā€œGradioā€. Free hardware is enough for our app, so no need to change it. ![Space Configuration](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png) After creating the Space, you have three options * You can clone the repository locally, add your files, and then push them to the Hub. * You can do all your coding directly in the browser. * (shown below) You can do the coding locally and then drag and drop your application file to the Hub. ![Space Config](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png) To upload your application file, pick ā€œAdd Fileā€ and drag and drop your file. ![New Space Landing](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_landing.png) Lastly, if your application includes any library other than Gradio, create a file called requirements.txt and add requirements like below: ```python matplotlib==3.6.3 scikit-learn==1.2.1 ``` And your app should be up and running! **Example Submissions** We left couple of examples below: (thereā€™s more at the end of this page) Documentation page for comparing linkage methods for hierarchical clustering and example Space built on it šŸ‘‡šŸ¼ [Comparing different hierarchical linkage methods on toy datasets](https://scikit-learn.org/stable/auto_examples/cluster/plot_linkage_comparison.html#sphx-glr-auto-examples-cluster-plot-linkage-comparison-py) [Hierarchical Clustering Linkage - a Hugging Face Space by scikit-learn](https://huggingface.co/spaces/scikit-learn/hierarchical-clustering-linkage) Note: If for your demo you're training a model from scratch (e.g. training an image classifier), you can push it to the Hub using [skops](https://skops.readthedocs.io/en/stable/) and build a Gradio demo on top of it. For such submission, we expect a model repository with a model card and the model weight as well as a simple Space with the interface that receives input and outputs results. You can use this tutorial to get started with [skops](https://www.kdnuggets.com/2023/02/skops-new-library-improve-scikitlearn-production.html). You can find an example submission for a model repository below. [scikit-learn/cancer-prediction-trees Ā· Hugging Face](https://huggingface.co/scikit-learn/cancer-prediction-trees) 4. After the demos are done, we will open pull requests to scikit-learn documentation in [scikit-learnā€™s repository](https://github.com/scikit-learn/scikit-learn) to contribute our application codes to be directly inside the documentation. We will help you out if this is your first open source contribution. šŸ¤—Ā  **If you need any help** you can join our discord server, take collaborate role and join `sklearn-sprint` channel and ask questions šŸ¤—šŸ«‚ ### Sprint Prizes We will be giving following vouchers that can be spent at [Hugging Face Store](https://store.huggingface.co/) including shipping, - $20 worth of voucher for everyone that builds three demos, - $40 worth of voucher for everyone that builds five demos.