Fortunately, artificial neural networks are more accessible to study than biological ones. Researchers can measure the activity of every neuron in the network, manipulate neurons by turning them on or off, and see how the network responds to different inputs. The features also allow researchers to control the network's behavior more precisely: by activating a feature artificially, they can make the network produce outputs that match the feature's meaning.
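To make these three operations concrete (reading activations, switching a neuron off, and artificially activating a feature), here is a minimal toy sketch in plain Python. It is not Anthropic's code or method: the weights in `W_IN` and `W_OUT`, the `FEATURE` direction, and the `forward` helper are all made-up names and values chosen for illustration, and a real feature would be learned from a model rather than written by hand.

```python
# Toy illustration only: all weights and the "feature" direction are invented.

def relu(xs):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in xs]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 2 outputs.
W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_OUT = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

# Hypothetical feature: a direction in the space of hidden activations.
FEATURE = [0.0, 1.0, 0.0]

def forward(x, ablate=None, steer=0.0):
    """Run the toy network, optionally intervening on the hidden layer.

    ablate: index of a hidden neuron to switch off (set to 0), or None.
    steer:  how strongly to add the FEATURE direction to the activations.
    Returns (hidden activations, outputs) so both can be inspected.
    """
    hidden = relu(matvec(W_IN, x))          # every activation is observable
    if ablate is not None:
        hidden[ablate] = 0.0                # turn one neuron off
    hidden = [h + steer * f for h, f in zip(hidden, FEATURE)]
    return hidden, matvec(W_OUT, hidden)

x = [1.0, 0.0]
print(forward(x))               # baseline activations and outputs
print(forward(x, ablate=0))     # same input with hidden neuron 0 switched off
print(forward(x, steer=2.0))    # same input with the feature pushed up
```

In this toy setup, pushing the feature up raises the second output above the first even though the input never changed, which is the kind of targeted behavioral change the article describes.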
This work stems from Anthropic's investment in mechanistic interpretability, one of its longest-term research bets on AI safety. Until now, the fact that individual neurons were uninterpretable presented a severe roadblock to a mechanistic understanding of language models. Decomposing groups of neurons into interpretable features has the potential to move past that roadblock.