I don’t think we should put all our eggs in the basket of building completely accurate and self-aware AI models. In fact, I don’t even think these are practical or beneficial goals.
We should teach users of AIs that there is no “absolute truth” in anything they get from an LLM. In fact, there is no absolute truth in anything a user gets from a search engine either. Or… from a medical doctor!
Doctors not only have horrible diagnostic accuracy; they also have no clue when they might be wrong.
In fact, LLMs allow for many different variations of “second opinions”. In the simplest one, you can ask the SAME LLM the same question several times, get a sense of the variability, and then make up your mind. A more involved approach goes beyond the variability of a single model: you also ask different models and combine their responses using some ensemble technique (the simplest being majority voting).
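As a rough sketch of that ensemble idea, the same question can be sent several times to one or more models and the answers combined by majority vote. The `ask_llm` helper below is a hypothetical placeholder for whatever chat API or client you actually use, and the model names are purely illustrative:

```python
from collections import Counter

def ask_llm(model: str, question: str) -> str:
    """Hypothetical placeholder: call whatever chat client/SDK you use
    and return the model's answer as a string."""
    raise NotImplementedError

def second_opinions(question: str, models: list[str], samples_per_model: int = 3) -> str:
    """Collect repeated answers from each model, then majority-vote.

    Repeated calls to the SAME model expose its sampling variability;
    querying several DIFFERENT models adds a simple ensemble on top.
    """
    answers = []
    for model in models:
        for _ in range(samples_per_model):
            answers.append(ask_llm(model, question))
    # Simplest ensemble technique mentioned above: majority voting.
    # Note: exact-string voting is naive for free-form text; in practice
    # you would normalize answers or constrain them to a fixed format.
    winner, count = Counter(answers).most_common(1)[0]
    print(f"{count}/{len(answers)} responses agreed on: {winner!r}")
    return winner

# Example usage (model names are illustrative):
# second_opinions("What is the capital of Australia?",
#                 models=["model-a", "model-b", "model-c"])
```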
While the solution outlined below is readily available, general, and scales well, it cannot be implemented unless it is accompanied by a great deal of effort in user messaging.
Chatbots based on LLMs should be explicit about their stochasticity and constantly invite the user to try again, reformulate their question, and do their own research.
LLMs fail to be accurate and factual while conveying a high degree of confidence.