One of the main techniques used in training A.I. models is RLHF, or Reinforcement Learning from Human Feedback. It is applied after pre-training, during fine-tuning, and involves interacting with the model to create conversations; these conversations are the units of data that shape how the model responds to the user.
RLHF means that a human element is always at play in this stage of training. Basically, the human acts as a teacher to the model: engaging it in conversation, flagging mistakes, factual inaccuracies, and even fabrications, and iterating on the conversation through several layers of feedback (the same conversation is reviewed by a different human to refine it before the final data is used to train the model deployed in production).
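The feedback loop described above can be sketched in miniature. In real RLHF, human raters compare pairs of model replies and a reward model is trained to prefer the reply the human chose. The toy below is only an illustrative assumption, not a real library or API: it uses a tiny bag-of-words "reward model" and a pairwise (Bradley-Terry style) logistic update, where actual systems use large neural networks.

```python
import math

# Toy RLHF-style preference learning: a reward model learns from
# human comparisons of two candidate replies. The vocabulary,
# feature scheme, and function names are illustrative assumptions.

def features(reply, vocab):
    # Bag-of-words feature vector over a small fixed vocabulary.
    words = reply.lower().split()
    return [words.count(w) for w in vocab]

def reward(weights, feats):
    # Linear reward score for a reply.
    return sum(w * f for w, f in zip(weights, feats))

def update(weights, preferred, rejected, vocab, lr=0.5):
    # Pairwise logistic (Bradley-Terry) step: raise the reward of the
    # human-preferred reply relative to the rejected one.
    fp, fr = features(preferred, vocab), features(rejected, vocab)
    p = 1 / (1 + math.exp(-(reward(weights, fp) - reward(weights, fr))))
    grad = 1 - p  # gradient of -log(p) w.r.t. the reward gap
    return [w + lr * grad * (a - b) for w, a, b in zip(weights, fp, fr)]

vocab = ["helpful", "sorry", "unsure"]
weights = [0.0, 0.0, 0.0]

# Simulated human feedback: the rater prefers the direct, helpful
# reply over the evasive one in each pair.
comparisons = [
    ("here is a helpful answer", "sorry i am unsure"),
    ("a helpful helpful summary", "sorry no idea"),
]
for preferred, rejected in comparisons:
    weights = update(weights, preferred, rejected, vocab)

good = reward(weights, features("a helpful reply", vocab))
bad = reward(weights, features("sorry i am unsure", vocab))
print(good > bad)  # → True: the learned reward ranks the preferred style higher
```

In production, this learned reward signal is then used to fine-tune the language model itself with reinforcement learning, which is the part the sketch leaves out.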