# deep-todo
Wondering what to do? Not anymore!
Generate arbitrary todos.
Source: <https://colab.research.google.com/drive/1PlKLrGHaCuvWCKNC4fmQEMElF-iRec9f?usp=sharing>
The todos come from a random selection of (public) repositories I had on my computer.
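The actual extraction lives in the linked Colab; as a rough sketch of what collecting such a dataset could look like (the regex, root path, and helper name here are illustrative, not the exact procedure used):

```python
import os
import re

# Grab everything from "TODO" to the end of the line.
TODO_RE = re.compile(r"TODO.*")

def collect_todos(root):
    """Walk `root` and collect every line fragment starting at 'TODO'."""
    todos = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    for line in f:
                        match = TODO_RE.search(line)
                        if match:
                            todos.append(match.group(0).strip())
            except OSError:
                continue  # skip unreadable files
    return todos

# e.g. collect_todos(os.path.expanduser("~/code"))
```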
## Sample
A bunch of generated todos:
```
----------------------------------------------------------------------------------------------------
0: TODO: should we check the other edges?/
1: TODO: add more information here.
2: TODO: We could also add more general functions in this case to avoid/
3: TODO: It seems strange to have the same constructor when the base set of/
4: TODO: This implementation should be simplified, as it's too complex to handle the/
5: TODO: we should be able to relax the intrinsic if not
6: TODO: Make sure this doesn't go through the next generation of plugins. It would be better if this was
7: TODO: There is always a small number of errors when we have this type/
8: TODO: Add support for 't' values (not 't') for all the constant types/
9: TODO: Check that we use loglef_cxx in the loop*
10: TODO: Support double or double values./
11: TODO: Add tests that verify that this function does not work for all targets/
12: TODO: we'd expect the result to be identical to the same value in terms of
13: TODO: We are not using a new type for 'w' as it does not denote 'y' yet, so we could/
14: TODO: if we had to find a way to extract the source file directly, we would/
15: TODO: this should fold into a flat array that would be/
16: TODO: Check if we can make it work with the correct address./
17: TODO: support v2i with V2R4+
18: TODO: Can a fast-math-flags check be generalized to all types of data? */
19: TODO: Add support for other type-specific VOPs.
```
Generated by:
```python
import tensorflow as tf

# `model`, `tokenizer`, and `input_ids` come from earlier cells in the
# linked notebook.
tf.random.set_seed(0)

sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=40,
    top_k=50,
    top_p=0.95,
    num_return_sequences=20,
)

print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    m = tokenizer.decode(sample_output, skip_special_tokens=True)
    m = m.split("TODO")[1].strip()
    print("{}: TODO{}".format(i, m))
```
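With `do_sample=True` each token is sampled rather than taken greedily; `top_k=50` and `top_p=0.95` restrict sampling to the 50 most likely candidates, further trimmed to the smallest set whose cumulative probability reaches 0.95, which keeps the todos varied without drifting into nonsense. `num_return_sequences=20` produces the twenty samples shown above, each capped at `max_length=40` tokens.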
## TODO
- [ ] Fix up the data; it seems to contain multiple todos per line
- [ ] Preprocess the data in a better way
- [ ] Download GitHub and train on everything