A new method can test whether a large language model contains hidden biases, personalities, moods, or other abstract concepts. MIT researchers can zero in on the connections within a model that encode a concept of interest, which could improve LLM safety and performance.
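The article does not detail the method's internals, but a common way to locate a concept inside a model is to find a direction in activation space that separates examples expressing the concept from examples that do not. The sketch below is illustrative only: synthetic vectors stand in for a real model's hidden states, and names like `concept_acts` and `direction` are hypothetical.

```python
# Minimal illustrative sketch, not the researchers' actual method.
import numpy as np

rng = np.random.default_rng(0)
d_model = 768  # hidden size of a hypothetical model layer

# Stand-ins for hidden activations collected from prompts that do / do not
# express the concept (e.g., a particular bias or mood).
concept_acts = rng.normal(0.5, 1.0, size=(100, d_model))
neutral_acts = rng.normal(0.0, 1.0, size=(100, d_model))

# Difference-of-means "concept direction": points from neutral activations
# toward concept-laden ones.
direction = concept_acts.mean(axis=0) - neutral_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

# Score a new activation by projecting it onto the concept direction;
# a larger projection suggests the concept is more strongly present.
new_act = rng.normal(0.3, 1.0, size=d_model)
score = float(new_act @ direction)
print(f"concept score: {score:.3f}")
```

In practice the activations would come from a real model's hidden layers rather than random draws, and the resulting direction could be used both to detect a concept and, in some approaches, to dampen or amplify it.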