## Progress Update
- Added `utils/config.py` with `SpaceConfig` dataclass to centralize environment handling.
- Updated `app.py` to use the configuration helper and surface configuration status in the temporary Gradio interface.
- Verified modules compile cleanly via `python -m compileall`.
- Added starter catalog at `catalog/candidates.json` and utility modules (`utils/data.py`, `utils/plotting.py`, `utils/hub.py`).
- Scaffolded job scripts under `jobs/`, including `run_experiment.py`, `train.py`, `eval.py`, and `scaling.py`.
- Replaced `app.py` with an initial Gradio Blocks interface that wires configuration, catalog loading, job submission, and polling against Hugging Face Jobs.
- Expanded dataset utilities for normalization and mixing; implemented PEFT-ready classification training, evaluation with weighted F1, and scaling-law plotting; and ensured artifacts are pushed to the results repo.
- Implemented extractive QA training and evaluation: added QA tokenization and post-processing helpers, expanded `jobs/train.py` for PEFT QA fine-tunes, and updated `jobs/eval.py` to compute Exact Match and F1 via the `squad` metric.
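For reference, the Exact Match and F1 scores named in the last item can be sketched in plain Python. This is only an illustration of the usual SQuAD v1.1 scoring convention (lowercase, strip punctuation and English articles, compare tokens); it is not the code in `jobs/eval.py`, and the function names are assumptions made for this note.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual convention is to score the prediction against each one and keep the maximum.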
## Bug-1
```bash
===== Application Startup at 2025-09-27 23:59:16 =====
********************************************
* Trying to transpile functions from Python -> JS for performance
* (1/1) <lambda>: ✅
********************************************
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡ (experimental, to disable set `ssr_mode=False` in `launch()`)
* To create a public link, set `share=True` in `launch()`.
=== Application restarted at 2025-09-28 00:37:35.548856498 UTC ===
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1052: UserWarning: Expected 8 arguments for function <function submit_experiments at 0x7f585c7ed2d0>, received 10.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1060: UserWarning: Expected maximum 8 arguments for function <function submit_experiments at 0x7f585c7ed2d0>, received 10.
warnings.warn(
Traceback (most recent call last):
File "/home/user/app/app.py", line 221, in <module>
demo = build_interface()
File "/home/user/app/app.py", line 197, in build_interface
run_btn.click(
File "/usr/local/lib/python3.10/site-packages/gradio/events.py", line 746, in event_trigger
return Dependency(block, dep.get_config(), dep_index, fn)
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in get_config
"inputs": [block._id for block in self.inputs],
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in <listcomp>
"inputs": [block._id for block in self.inputs],
AttributeError: type object 'OAuthProfile' has no attribute '_id'
=== Application stopped (exit code: 1) at 2025-09-28 00:37:41.809460721 UTC ===
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1052: UserWarning: Expected 8 arguments for function <function submit_experiments at 0x7f3a177ba320>, received 10.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1060: UserWarning: Expected maximum 8 arguments for function <function submit_experiments at 0x7f3a177ba320>, received 10.
warnings.warn(
Traceback (most recent call last):
File "/home/user/app/app.py", line 221, in <module>
demo = build_interface()
File "/home/user/app/app.py", line 197, in build_interface
run_btn.click(
File "/usr/local/lib/python3.10/site-packages/gradio/events.py", line 746, in event_trigger
return Dependency(block, dep.get_config(), dep_index, fn)
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in get_config
"inputs": [block._id for block in self.inputs],
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in <listcomp>
"inputs": [block._id for block in self.inputs],
AttributeError: type object 'OAuthProfile' has no attribute '_id'
=== Application stopped (exit code: 1) at 2025-09-28 00:47:47.132882448 UTC ===
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1052: UserWarning: Expected 8 arguments for function <function submit_experiments at 0x7f9c55bc23b0>, received 10.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/utils.py:1060: UserWarning: Expected maximum 8 arguments for function <function submit_experiments at 0x7f9c55bc23b0>, received 10.
warnings.warn(
Traceback (most recent call last):
File "/home/user/app/app.py", line 221, in <module>
demo = build_interface()
File "/home/user/app/app.py", line 197, in build_interface
run_btn.click(
File "/usr/local/lib/python3.10/site-packages/gradio/events.py", line 746, in event_trigger
return Dependency(block, dep.get_config(), dep_index, fn)
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in get_config
"inputs": [block._id for block in self.inputs],
File "/usr/local/lib/python3.10/site-packages/gradio/block_function.py", line 142, in <listcomp>
"inputs": [block._id for block in self.inputs],
AttributeError: type object 'OAuthProfile' has no attribute '_id'
=== Application stopped (exit code: 1) at 2025-09-28 01:09:40.504908107 UTC ===
```
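The traceback above suggests that the `OAuthProfile` class itself ended up in the `inputs` list of `run_btn.click(...)`: Gradio reads `_id` from every entry, and a plain class has none. The sketch below reproduces that failure with stand-in classes (no Gradio import; `Component`, `input_ids`, and `component_inputs` are hypothetical names for this note) and shows that dropping the non-component entry avoids it.

```python
import itertools

_ids = itertools.count()


class Component:
    """Stand-in for a Gradio component instance, which carries an `_id`."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._id = next(_ids)


class OAuthProfile:
    """Stand-in for `gr.OAuthProfile`: a plain type, not a component."""


def input_ids(inputs: list) -> list:
    # Same shape as the list comprehension shown in the traceback.
    return [block._id for block in inputs]


def component_inputs(inputs: list) -> list:
    # Keep only component instances; anything else cannot be wired as an input.
    return [block for block in inputs if isinstance(block, Component)]


task = Component("task")
dataset = Component("dataset")
wired = [task, dataset, OAuthProfile]  # the class itself slipped into `inputs`

try:
    input_ids(wired)
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

fixed_ids = input_ids(component_inputs(wired))
```

In the real app, the likely fix is to leave `gr.OAuthProfile` and `gr.OAuthToken` out of `inputs=` and declare them only as typed parameters of `submit_experiments`, since Gradio supplies OAuth values based on the handler's type hints. Two extra entries in `inputs` would also be consistent with the "Expected 8 arguments ... received 10" warnings.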
- Added pretraining task support: data utilities handle text corpora/test datasets, training/eval scripts run causal LM fine-tunes with perplexity metrics, and `run_experiment.py` threads optional test datasets through summary/scaling.
- Expanded Gradio UI to include pretraining selection, dynamic metric/candidate options, and optional test dataset uploads or Hub references.
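The perplexity metric mentioned for the pretraining task is commonly defined as the exponential of the mean per-token negative log-likelihood. The sketch below assumes that definition and natural-log losses; the record does not show how `jobs/eval.py` actually computes it, and both function names are hypothetical.

```python
import math


def perplexity(token_nlls: list) -> float:
    """exp of the mean per-token negative log-likelihood (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def corpus_perplexity(batches: list) -> float:
    """Combine (mean_loss, token_count) pairs, weighting each by its token count."""
    total_tokens = sum(count for _, count in batches)
    if total_tokens == 0:
        raise ValueError("need at least one token")
    total_nll = sum(loss * count for loss, count in batches)
    return math.exp(total_nll / total_tokens)
```

Weighting by token count matters when batches differ in length: averaging per-batch perplexities directly would give a different, generally incorrect, corpus-level number.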
## Bug-2
```bash
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡ (experimental, to disable set `ssr_mode=False` in `launch()`)
* To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 745, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2116, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1623, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/app.py", line 72, in submit_experiments
d0_repo = ensure_uploaded_dataset(
File "/home/user/app/utils/hub.py", line 22, in ensure_uploaded_dataset
raise ValueError("Please upload D₀ or provide a Hub dataset id.")
ValueError: Please upload D₀ or provide a Hub dataset id.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 745, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2116, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1623, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/app.py", line 72, in submit_experiments
d0_repo = ensure_uploaded_dataset(
File "/home/user/app/utils/hub.py", line 22, in ensure_uploaded_dataset
raise ValueError("Please upload D₀ or provide a Hub dataset id.")
ValueError: Please upload D₀ or provide a Hub dataset id.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 745, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2116, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1623, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/app.py", line 72, in submit_experiments
d0_repo = ensure_uploaded_dataset(
File "/home/user/app/utils/hub.py", line 22, in ensure_uploaded_dataset
raise ValueError("Please upload D₀ or provide a Hub dataset id.")
ValueError: Please upload D₀ or provide a Hub dataset id.
SPACE_HOST: tianhaowang-demo-curation.hf.space
SPACE_HOST after split: tianhaowang-demo-curation.hf.space
Redirect URI: https://tianhaowang-demo-curation.hf.space/login/callback?_target_url=%2F%3F__theme%3Dsystem
=== Application restarted at 2025-09-28 02:22:42.150125320 UTC ===
Traceback (most recent call last):
File "/home/user/app/app.py", line 272, in <module>
demo = build_interface()
File "/home/user/app/app.py", line 246, in build_interface
task.change(fn=on_task_change, inputs=task, outputs=[metrics, dk])
File "/usr/local/lib/python3.10/site-packages/gradio/events.py", line 703, in event_trigger
dep, dep_index = root_block.set_event_trigger(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 715, in set_event_trigger
check_function_inputs_match(fn, inputs, inputs_as_dict)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 1035, in check_function_inputs_match
parameter_types = get_type_hints(fn)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 999, in get_type_hints
return typing.get_type_hints(fn)
File "/usr/local/lib/python3.10/typing.py", line 1871, in get_type_hints
value = _eval_type(value, globalns, localns)
File "/usr/local/lib/python3.10/typing.py", line 327, in _eval_type
return t._evaluate(globalns, localns, recursive_guard)
File "/usr/local/lib/python3.10/typing.py", line 694, in _evaluate
eval(self.__forward_code__, globalns, localns),
File "<string>", line 1, in <module>
AttributeError: type object 'CheckboxGroup' has no attribute 'update'
=== Application stopped (exit code: 1) at 2025-09-28 02:22:46.235131589 UTC ===
```
- Updated `jobs/train.py` to align with Transformers 4.56 (`eval_strategy`) and executed a local pretraining smoke test using `sshleifer/tiny-gpt2` on synthetic datasets (`tmp/pretrain_smoke/*`), stubbing artifact upload.
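The final traceback in the Bug-2 log fails inside `typing.get_type_hints`, which suggests that an annotation on `on_task_change` refers to `CheckboxGroup.update`, an attribute the installed Gradio version does not have. The sketch below reproduces the mechanism with a stand-in class and no Gradio import; the handler bodies, `NAMESPACE`, and `resolve_hints` are hypothetical.

```python
import typing


class CheckboxGroup:
    """Stand-in for a component class that has no `update` attribute."""


def broken_handler(task: str) -> "CheckboxGroup.update":
    # A string annotation is stored unevaluated and only resolved later,
    # when something calls typing.get_type_hints on the function.
    return None


def fixed_handler(task: str) -> dict:
    # A plain return annotation resolves without touching the component class.
    return {"choices": [task], "value": []}


# Explicit namespace so the sketch behaves the same wherever it is run.
NAMESPACE = {"CheckboxGroup": CheckboxGroup}


def resolve_hints(fn):
    """Return the resolved hints, or the error message if resolution fails."""
    try:
        return typing.get_type_hints(fn, globalns=NAMESPACE)
    except AttributeError as exc:
        return str(exc)
```

Because the annotation is only evaluated when the event is registered, the error shows up at `task.change(...)` rather than where the function is defined. In Gradio 4 and later the per-component `update()` helpers are gone; a handler would typically return `gr.update(...)` or a new component instance instead, and its annotations should avoid the removed attribute.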
## Debug Notes - Run button disabled
- Observed report: UI shows uploaded D₀/test data and `c4` selected under Pretraining, yet `Run experiments` remains disabled.
- Hypothesis: event wiring may silently fail when OAuth profile/token aren't injected; the button stays greyed out if Gradio cannot resolve the click handler.
- Need to confirm whether OAuth login succeeded and whether the browser console reports component wiring errors.
### What would help
- After logging in, open browser dev tools → Console tab and try clicking the button; copy any errors (especially ones mentioning `submit_experiments` or missing inputs).
- In the Space, click the login button and confirm it shows your username; if it keeps prompting, note the behaviour.
- If possible, run the Space in dev mode with `?__theme=dark&__debug=true` and grab the `gradio_config` snippet for the `Run experiments` dependency from the network panel (to verify the handler attached).
- Chrome console steps: press `Ctrl+Shift+I` (or `Cmd+Option+I` on macOS) to open DevTools, switch to the **Console** tab, click "Run experiments", then copy any red error lines that appear.
- Chrome config capture: open the Space URL with `?__debug=true` added (e.g. `https://.../?__debug=true`), open DevTools → **Network**, filter for `config`, click the latest response, and copy the JSON snippet under the `dependencies` entry for the run button.
- Chrome console shows mixed-content warning on sign-in (expected during dev restart) and 404s for `/gradio_api/upload_progress?upload_id=undefined` when uploading files; no console output on clicking Run, suggesting the click handler never fires.
- Updated `submit_experiments` signature to default the OAuth inputs to `None`; Gradio now calls it successfully even if the login component isn’t wired in dev mode.
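
The defaulted signature described in the last item might look roughly like the sketch below. It is a minimal illustration with stand-in types: `Profile`, `Token`, the parameter names, and the returned messages are all hypothetical, and the real `submit_experiments` takes more inputs than are shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    """Hypothetical stand-in for the OAuth profile object."""

    username: str


@dataclass
class Token:
    """Hypothetical stand-in for the OAuth token object."""

    token: str


def submit_experiments(
    task: str,
    dataset_id: str,
    profile: Optional[Profile] = None,
    token: Optional[Token] = None,
) -> str:
    # The None defaults let the handler be called even when no login
    # component supplies OAuth values, for example in dev mode.
    if profile is None or token is None:
        return "Please sign in before submitting experiments."
    return f"Submitted {task} on {dataset_id} as {profile.username}"
```

Returning a readable message for the logged-out case keeps the button responsive, so a missing login shows up as visible feedback rather than as a click that appears to do nothing.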