#+title: Requests

#+begin_src bash
curl -X 'GET' \
  'http://localhost:8080/clear/' \
  -H 'accept: application/json'
#+end_src

#+RESULTS:
: OK

#+begin_src bash
curl -X 'POST' \
  'http://localhost:8080/submit/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "author": "ben",
  "content": "In literature discussing why ChatGPT is able to capture so much of our imagination, I often come across two narratives: Scale: throwing more data and compute at it. UX: moving from a prompt interface to a more natural chat interface. A narrative that is often glossed over in the demo frenzy is the incredible technical creativity that went into making models like ChatGPT work. One such cool idea is RLHF (Reinforcement Learning from Human Feedback): incorporating reinforcement learning and human feedback into NLP. RL has been notoriously difficult to work with, and therefore, mostly confined to gaming and simulated environments like Atari or MuJoCo. Just five years ago, both RL and NLP were progressing pretty much orthogonally – different stacks, different techniques, and different experimentation setups. It’s impressive to see it work in a new domain at a massive scale. So, how exactly does RLHF work? Why does it work? This post will discuss the answers to those questions."
}'
#+end_src

#+RESULTS:
: Submitted job fef72c3aa4394bc7a299291c80a5c06b

#+begin_src bash
curl -X 'POST' \
  'http://localhost:8080/submit/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "author": "ben",
  "content": "https://en.wikipedia.org/wiki/Goulburn_Street"
}'
#+end_src

#+RESULTS:
: Submitted job f37729bb36104ab4a23cefd0480e4862

#+begin_src bash
curl -X 'POST' \
  'http://localhost:8080/submit/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "author": "ben",
  "content": "https://upload.wikimedia.org/wikipedia/commons/thumb/e/e1/Cattle_tyrant_%28Machetornis_rixosa%29_on_Capybara.jpg/1920px-Cattle_tyrant_%28Machetornis_rixosa%29_on_Capybara.jpg"
}'
#+end_src

#+RESULTS:
: Submitted job dc3da7b1d5aa47c38dc6713952104f5f

#+begin_src bash
curl -X 'GET' \
  'http://localhost:8080/check_job_status/' \
  -H 'accept: application/json'
#+end_src

#+RESULTS:
: Found 3 pending job(s): fef72c3aa4394bc7a299291c80a5c06b, f37729bb36104ab4a23cefd0480e4862, dc3da7b1d5aa47c38dc6713952104f5f

#+begin_src bash
curl -X 'GET' \
  'http://localhost:8080/recent/' \
  -H 'accept: application/json'
#+end_src

#+RESULTS:
| [{"id":"dc3da7b1d5aa47c38dc6713952104f5f" | author:"ben" | summary:"A small bird is perched on the back of a capy capy. It's looking for a place to nestle. It doesn't seem to be finding a suitable place for it | though | because it's not very big. The place is not very flat. " | tags:["#back" | #bird | #capy | #general | #perch | #perched] | date:"2023-05-11T13:16:48"} | {"id":"f37729bb36104ab4a23cefd0480e4862" | author:"ben" | summary:"Goulburn Street is a street in the central business district of Sydney in New South Wales | Australia. It runs from Darling Harbour and Chinatown in the west to Crown Street in the east at Darlinghurst and Surry Hills. It is the only car park operated by Sydney City Council within the CBD and was the first air rights car park in Australia." | tags:["#centralbusinessdistrict" | #darlinghurst | #general | #goulburnstreet | #surryhills | #sydney | #sydneymasoniccentre] | date:"2023-05-11T13:16:47"} | {"id":"fef72c3aa4394bc7a299291c80a5c06b" | author:"ben" | summary:"ChatGPT is able to capture our imagination because of its scale. RLHF (Reinforcement Learning from Human Feedback) is a new approach to NLP that incorporates reinforcement learning and human feedback into NLP. It's impressive to see it work in a new domain at a massive scale." | tags:["#" | #general | #rlhf] | date:"2023-05-11T13:16:45"}] |

#+begin_src bash
curl -X 'GET' \
  'http://localhost:8080/recent/rlhf' \
  -H 'accept: application/json'
#+end_src

#+RESULTS:
| [{"id":"fef72c3aa4394bc7a299291c80a5c06b" | author:"ben" | summary:"ChatGPT is able to capture our imagination because of its scale. RLHF (Reinforcement Learning from Human Feedback) is a new approach to NLP that incorporates reinforcement learning and human feedback into NLP. It's impressive to see it work in a new domain at a massive scale." | tags:["#" | #general | #rlhf] | date:"2023-05-11T13:16:45"}] |
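
The submit and status responses above are plain strings, so a script that chains these requests needs to pull the job ids out of them. The block below is a minimal sketch of that step. The function names =job_id_from_response= and =pending_jobs_from_status= are hypothetical helpers, not part of the service, and they assume the responses always have the exact shapes recorded in the results above (=Submitted job <id>= and =Found N pending job(s): id1, id2, ...=). What the status endpoint returns when nothing is pending is not shown here, so that case is simply treated as "no match".

#+begin_src bash
# Hypothetical helpers; they assume the plain-text response shapes
# recorded in the results above and make no network calls themselves.

# Print the job id from a "Submitted job <id>" response.
# Returns non-zero if the response does not have that shape.
job_id_from_response() {
  local response="$1"
  case "$response" in
    "Submitted job "*) printf '%s\n' "${response#Submitted job }" ;;
    *) return 1 ;;
  esac
}

# Print one pending job id per line from a
# "Found N pending job(s): id1, id2, ..." response.
# Returns non-zero for any other response shape.
pending_jobs_from_status() {
  local status="$1"
  case "$status" in
    "Found "*" pending job(s): "*) ;;
    *) return 1 ;;
  esac
  # Drop everything up to the first ": ", then split on commas.
  local ids="${status#*: }"
  printf '%s\n' "$ids" | tr -d ' ' | tr ',' '\n'
}

# Demo using a response string copied from the results above.
job_id_from_response 'Submitted job fef72c3aa4394bc7a299291c80a5c06b'
#+end_src

Against a running server, the same helpers could wrap the requests shown earlier, for example =pending_jobs_from_status "$(curl -s 'http://localhost:8080/check_job_status/')"=, which would print each pending id on its own line as long as the response keeps the shape assumed here.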