The core function of an LLM is to predict the most likely next token (a word or piece of a word). For every token in the model's vocabulary, the model generates a logit. The word comes from "logistic unit" and is usually pronounced in English as 'LOH-jit'. A logit is the raw, unscaled score for how likely that token is to come next. Logits have no fixed range, but depending on the model they tend to fall somewhere between -10 and 10.

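The logits are then passed through what is called a softmax function, which exponentiates each logit and divides it by the sum of all the exponentials. This makes every value positive, and the values sum to 1, so they can be treated as probabilities. Temperature is also applied in the softmax function: each logit is divided by the temperature before it is exponentiated, which changes how the probability is distributed across tokens.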
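
As a rough illustration of the softmax step, here is a minimal Python sketch. The function name `softmax` and the three example logits are made up for this note and are not taken from any particular model or library. Subtracting the largest logit before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that are positive and sum to 1."""
    # Subtract the max logit for numerical stability (result is unchanged).
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
example_logits = [2.0, 1.0, -1.0]
probs = softmax(example_logits)
print(probs)
print(sum(probs))
```

Even the token with a negative logit ends up with a small positive probability, and the ordering of the tokens is preserved.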

A temperature less than 1 results in high logits having relatively higher probabilities and low logits having relatively lower probabilities: dividing by a number smaller than 1 spreads the logits further apart. This results in the higher-probability tokens having an even higher probability of being chosen.

A temperature greater than 1 reduces the differences between logits, so the resulting probabilities are closer together. This means that lower-probability tokens have a higher chance of being chosen than they did before.
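
The effect of temperature can be sketched in a few lines of Python. This assumes temperature is applied by dividing each logit by it before the softmax, which is the usual formulation. The function name `softmax_with_temperature` and the example logits are hypothetical, chosen only for this note.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by temperature (assumed formulation)."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    # Subtract the max for numerical stability (result is unchanged).
    largest = max(scaled)
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
token_logits = [2.0, 1.0, -1.0]

cold = softmax_with_temperature(token_logits, temperature=0.5)
neutral = softmax_with_temperature(token_logits, temperature=1.0)
hot = softmax_with_temperature(token_logits, temperature=2.0)

for name, dist in [("T=0.5", cold), ("T=1.0", neutral), ("T=2.0", hot)]:
    print(name, [round(p, 3) for p in dist])
```

With these inputs, the top token's probability is highest at the low temperature and lowest at the high temperature, while the least likely token moves in the opposite direction.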

#ollama/parameters