# deep-todo

Wondering what to do? Not anymore!

Generate arbitrary TODOs.

Source: <https://colab.research.google.com/drive/1PlKLrGHaCuvWCKNC4fmQEMElF-iRec9f?usp=sharing>

The TODOs come from a random selection of (public) repositories I had on my computer.

### Sample

A bunch of TODOs:

```
----------------------------------------------------------------------------------------------------
0: TODO: should we check the other edges?/
1: TODO: add more information here.
2: TODO: We could also add more general functions in this case to avoid/
3: TODO: It seems strange to have the same constructor when the base set of/
4: TODO: This implementation should be simplified, as it's too complex to handle the/
5: TODO: we should be able to relax the intrinsic if not
6: TODO: Make sure this doesn't go through the next generation of plugins. It would be better if this was
7: TODO: There is always a small number of errors when we have this type/
8: TODO: Add support for 't' values (not 't') for all the constant types/
9: TODO: Check that we use loglef_cxx in the loop*
10: TODO: Support double or double values./
11: TODO: Add tests that verify that this function does not work for all targets/
12: TODO: we'd expect the result to be identical to the same value in terms of
13: TODO: We are not using a new type for 'w' as it does not denote 'y' yet, so we could/
14: TODO: if we had to find a way to extract the source file directly, we would/
15: TODO: this should fold into a flat array that would be/
16: TODO: Check if we can make it work with the correct address./
17: TODO: support v2i with V2R4+
18: TODO: Can a fast-math-flags check be generalized to all types of data? */
19: TODO: Add support for other type-specific VOPs.
```

Generated by:

```
tf.random.set_seed(0)

sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=40,
    top_k=50,
    top_p=0.95,
    num_return_sequences=20
)

print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    m = tokenizer.decode(sample_output, skip_special_tokens=True)
    m = m.split("TODO")[1].strip()
    print("{}: TODO{}".format(i, m))
```
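
For intuition, the `top_k`/`top_p` arguments to `generate` restrict sampling to the most likely tokens. A minimal, dependency-free sketch of that filtering step (a simplified illustration, not the actual Hugging Face implementation):

```python
import math

def filter_logits(logits, top_k=50, top_p=0.95):
    """Mask logits outside the top-k, then outside the top-p nucleus."""
    logits = list(logits)
    # Top-k: keep only the k largest logits, mask the rest to -inf.
    if top_k > 0:
        kth = sorted(logits, reverse=True)[min(top_k, len(logits)) - 1]
        logits = [l if l >= kth else -math.inf for l in logits]
    # Top-p: walk tokens in descending order and keep the smallest
    # prefix whose cumulative softmax probability exceeds top_p.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    mx = logits[order[0]]
    weights = [math.exp(logits[i] - mx) if logits[i] != -math.inf else 0.0
               for i in order]
    total = sum(weights)
    cum, kept = 0.0, set()
    for i, w in zip(order, weights):
        kept.add(i)
        cum += w / total
        if cum > top_p:
            break
    return [l if i in kept else -math.inf for i, l in enumerate(logits)]
```

Sampling then draws from the softmax over the surviving logits only, which is why the generated TODOs stay fluent instead of wandering into low-probability tokens.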

## TODO

- [ ] Fix up the data; it seems to contain multiple TODOs per line
- [ ] Preprocess the data in a better way
- [ ] Download GitHub and train on everything
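
For the preprocessing items above, one way to avoid multiple TODOs per training line is to take only the first marker on each source line. A rough sketch (the helper and regex are assumptions, not the notebook's actual preprocessing):

```python
import re

# Match "TODO" followed by ":" or whitespace, then capture the message.
TODO_RE = re.compile(r"TODO[:\s]\s*(.*)")

def extract_todos(text):
    """Return one TODO message per source line (first match only),
    so lines with several markers don't yield concatenated entries."""
    todos = []
    for line in text.splitlines():
        m = TODO_RE.search(line)
        if m:
            todos.append(m.group(1).strip())
    return todos
```

Running this over each repository file would give a cleaner one-TODO-per-example dataset before tokenization.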