Sunday, September 15, 2024

Novel Architecture Makes Neural Networks More Understandable: KANs

Recommended reading!

"... An April 2024 study introduced an alternative neural network design, called a Kolmogorov-Arnold network (KAN), that is more transparent yet can also do almost everything a regular neural network can for a certain class of problems. It’s based on a mathematical idea from the mid-20th century that has been rediscovered and reconfigured for deployment in the deep learning era. ...

Yet for the past 35 years, KANs were thought to be fundamentally impractical. ...

In 1957, the mathematicians Andrey Kolmogorov ... and Vladimir Arnold ... showed — in separate though complementary papers — that if you have a single mathematical function that uses many variables, you can transform it into a combination of many functions that each have a single variable. ..."
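For readers who want the statement behind that sentence, the Kolmogorov-Arnold representation theorem is usually written as follows. The symbol names (f, n, Φ_q, φ_{q,p}) follow one common textbook convention and are not taken from the quoted article: for any continuous function f on the unit cube [0,1]^n there exist continuous single-variable functions Φ_q and φ_{q,p} such that

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)
```

The only operation that mixes variables is addition; everything else is a function of one variable.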

From the abstract:
"Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs."

Novel Architecture Makes Neural Networks More Understandable | Quanta Magazine
