Commit d8b25f9 · Parent(s): 9a87acd

Added more explainers

about.py CHANGED
@@ -42,7 +42,9 @@ Here we invite the community to submit and develop better predictors, which will
 #### 🏆 Prizes

 For each of the 5 properties in the competition, there is a prize for the model with the highest performance for that property on the private test set.
-There is also an 'open-source' prize for the best model trained on the GDPa1 dataset
+There is also an 'open-source' prize for the best reproducible model: one that is trained on the GDPa1 dataset (reporting cross-validation results) and assessed on the private test set, where the authors provide all training code and data.
+This will be judged by a panel (by default, the model with the highest average Spearman correlation across all properties will be selected, but a really good model on just one property may be better for the community).
+
 For each of these 6 prizes, participants have the choice between
 - **$10 000 in data generation credits** with [Ginkgo Datapoints](https://datapoints.ginkgo.bio/), or
 - A **$2000 cash prize**.
@@ -124,7 +126,7 @@ FAQS = {
             "No, there are no requirements to submit code / methods and submitted predictions remain private. "
             "We also have an optional field for including a short model description. "
             "Top performing participants will be requested to identify themselves at the end of the tournament. "
-            "There will be one prize for the best open-source model, which will require code / methods to be available."
+            "There will be one prize for the best open-source reproducible model, which will require code / methods to be available."
         ),
         "How exactly can I evaluate my model?": (
             "You can easily calculate the Spearman correlation coefficient on the GDPa1 dataset yourself before uploading to the leaderboard. "
@@ -172,25 +174,28 @@ SUBMIT_INSTRUCTIONS = f"""
 You do **not** need to predict all 5 properties – each property has its own leaderboard and prize.

 ## Instructions
-1. **Upload
-   - **GDPa1 Cross-Validation predictions** (using cross-validation folds)
-   - **Private Test Set predictions** (final test submission)
+1. **Upload two CSV files**: one with GDPa1 cross-validation predictions, and one with private test set predictions.
 2. Each CSV should contain `antibody_name` + one column per property you are predicting (e.g. `"antibody_name,Titer,PR_CHO"` if your model predicts Titer and Polyreactivity).
    - List of valid property names: `{', '.join(ASSAY_LIST)}`.
-
+   - Include the `"hierarchical_cluster_IgG_isotype_stratified_fold"` column if submitting cross-validation predictions.
+3. You can resubmit as often as you like; only your latest submission will count for both the leaderboard and final test set scoring.

 The GDPa1 results should appear on the leaderboard within a minute, and can also be calculated manually using average Spearman rank correlation across the 5 folds.

 ## Cross-validation

-For the GDPa1 cross-validation predictions
-
-
+For the GDPa1 cross-validation predictions:
+1. Split the dataset using the `"hierarchical_cluster_IgG_isotype_stratified_fold"` column
+2. Train on 4 folds and predict on the held-out fold
+3. Collect held-out predictions for all 5 folds into one dataframe
+4. Write this dataframe to a .csv file and submit as your GDPa1 cross-validation predictions
+
+The leaderboard will show the average Spearman rank correlation across the 5 folds. For a code example, check out our tutorial on training an antibody developability prediction model with cross-validation [here]({TUTORIAL_URL}).

 ## Test set

-The **private test set
-🗓️
+The **private test set submissions will not be scored automatically**, to avoid test set hacking. They will be evaluated after submissions close to determine the winners.
+🗓️ We will release one interim scoring of the latest private test set submissions on **October 13th**. Use this opportunity to see how your model is performing on the held-out test set and refine accordingly.

 Submissions close on **1 November 2025**.
 """
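To make the "evaluate it yourself" FAQ answer concrete, here is a minimal sketch of scoring predictions against the GDPa1 labels locally with a Spearman correlation. The file names, and the use of Titer and PR_CHO as example properties, are placeholders rather than anything mandated by the competition.

```python
# Minimal sketch: score your predictions against the GDPa1 labels before uploading.
# "gdpa1_labels.csv" and "my_predictions.csv" are placeholder file names; both are
# assumed to share an `antibody_name` column.
import pandas as pd
from scipy.stats import spearmanr

labels = pd.read_csv("gdpa1_labels.csv")    # measured assay values
preds = pd.read_csv("my_predictions.csv")   # your model's predictions

merged = labels.merge(preds, on="antibody_name", suffixes=("_true", "_pred"))

for prop in ["Titer", "PR_CHO"]:            # only the properties your model predicts
    rho, _ = spearmanr(merged[f"{prop}_true"], merged[f"{prop}_pred"], nan_policy="omit")
    print(f"{prop}: Spearman rho = {rho:.3f}")
```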
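To illustrate the column layout described in step 2 of the updated instructions, here is a minimal sketch of the two CSVs. The antibody names and prediction values are invented, and only Titer and PR_CHO are used as example property columns.

```python
# Minimal sketch of the submission layout: `antibody_name` plus one column per
# predicted property, with the fold column included only in the CV file.
# All names and values below are illustrative placeholders.
import pandas as pd

cv_submission = pd.DataFrame({
    "antibody_name": ["Ab_001", "Ab_002"],
    "hierarchical_cluster_IgG_isotype_stratified_fold": [0, 3],
    "Titer": [812.5, 640.2],
    "PR_CHO": [0.12, 0.47],
})
cv_submission.to_csv("gdpa1_cv_predictions.csv", index=False)

# The private test set file has the same property columns but no fold column,
# with one row per antibody in the private test set.
test_submission = pd.DataFrame({
    "antibody_name": ["TestAb_001", "TestAb_002"],
    "Titer": [770.0, 915.3],
    "PR_CHO": [0.08, 0.33],
})
test_submission.to_csv("private_test_predictions.csv", index=False)
```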
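The four cross-validation steps in the updated instructions translate roughly into the following sketch. The dataset path is a placeholder, and the training-set-mean predictor only stands in for whatever model you actually train; it is not the competition's baseline.

```python
# Rough sketch of the cross-validation workflow: train on 4 folds, predict the
# held-out fold, collect all held-out predictions, and write a single CSV.
# "gdpa1.csv" is a placeholder path; the mean predictor is a stand-in model.
import pandas as pd

FOLD_COL = "hierarchical_cluster_IgG_isotype_stratified_fold"
PROPERTY = "Titer"                           # repeat (or loop) for each property you predict

df = pd.read_csv("gdpa1.csv")
oof_frames = []

for fold in sorted(df[FOLD_COL].unique()):
    train = df[df[FOLD_COL] != fold]         # the other 4 folds
    held_out = df[df[FOLD_COL] == fold]      # the held-out fold

    prediction = train[PROPERTY].mean()      # placeholder model: swap in your own
    oof_frames.append(pd.DataFrame({
        "antibody_name": held_out["antibody_name"].values,
        FOLD_COL: fold,
        PROPERTY: prediction,
    }))

# One row per GDPa1 antibody with its held-out prediction; this is the file to
# submit as the GDPa1 cross-validation predictions.
pd.concat(oof_frames).to_csv("gdpa1_cv_predictions.csv", index=False)
```

Averaging the per-fold Spearman correlations of these held-out predictions against the GDPa1 labels, as in the earlier scoring sketch, should correspond to the average shown on the leaderboard.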
 
			
