
Tokyo-based startup Sakana AI’s Evolutionary Model Merge technique automatically creates generative models. Inspired by natural selection, it builds on existing models and combines them to produce more capable ones.
Sakana AI first made its presence known in August 2023. It was co-founded by prominent AI researchers, including former Googler David Ha and “Attention Is All You Need” co-author Llion Jones, both considered pioneers of the current generative AI era.
Sakana’s Evolutionary Model Merge technique can enable developers and organizations to rapidly create and discover new models cost-effectively, without incurring huge training and fine-tuning expenses.
Sakana recently unveiled two models developed with Evolutionary Model Merge: a large language model (LLM) and a vision-language model (VLM).
Model merging
Training generative models is an expensive and complicated process that many organizations cannot afford. But with open models such as Llama 2 and Mistral now available, developers have found creative ways to improve them at low cost.
One such method is “model merging,” in which components of multiple pre-trained models are combined into a single new model that can potentially inherit the strengths and capabilities of each.
Merged models do not require additional training, which makes them extremely cost-effective. In fact, many of the top-performing models on Open LLM leaderboards are merged versions of popular base models.
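To make the idea concrete, here is a minimal sketch of one simple form of merging: linear interpolation of the weights of two models that share the same architecture. It is an illustration only, not Sakana AI’s method. The toy dictionaries standing in for models, the `merge_linear` function and the `alpha` parameter are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy "models": parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def merge_linear(model_a: Weights, model_b: Weights, alpha: float) -> Weights:
    """Interpolate two models with identical parameter names and shapes.

    alpha = 1.0 returns model_a's weights, alpha = 0.0 returns model_b's.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")
    merged: Weights = {}
    for name in model_a:
        wa, wb = model_a[name], model_b[name]
        if len(wa) != len(wb):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Weighted average of each pair of corresponding weights.
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(wa, wb)]
    return merged


# Illustrative values only; real models hold billions of weights.
model_a = {"layer0": [1.0, 2.0], "layer1": [0.0, 4.0]}
model_b = {"layer0": [3.0, 0.0], "layer1": [2.0, 0.0]}
merged = merge_linear(model_a, model_b, alpha=0.5)
print(merged)
```

No gradient step is involved, which is why merging costs so little compared with training. The hard part is choosing which models to combine and in what proportions.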
“What we are witnessing today is a wide-ranging community of researchers, hackers, enthusiasts and artists all working on creating foundation models by fine-tuning existing models on specific datasets or merging existing ones together,” Sakana AI researchers write on their company blog.
With more than 500,000 models available on Hugging Face, model merging offers researchers, developers, and organizations a vast range of options for exploring or creating new models at low cost. However, model merging relies heavily on intuition and domain knowledge.
Evolutionary Model Merge
Sakana AI’s new Evolutionary Model Merge technique aims to offer a more systematic method for discovering effective model merges.
Sakana AI’s researchers believe evolutionary algorithms, inspired by natural selection, can unlock more effective merging solutions.
Evolutionary algorithms are population-based optimization techniques modeled on biological evolution. They iteratively create candidate solutions by combining elements of an existing population, then select the candidates with the highest fitness scores to carry forward. Evolutionary algorithms can explore a vast space of possibilities, including novel combinations that traditional methods or human intuition might miss.
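The loop described above can be sketched in a few lines of Python. This is a generic textbook-style example on a toy problem (maximizing the number of ones in a bit string), not code from Sakana AI. The function names, population size, mutation rate and other settings are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple


def fitness(bits: List[int]) -> int:
    # Toy fitness function: the number of ones in the candidate.
    return sum(bits)


def evolve(pop_size: int = 20, length: int = 16, generations: int = 40,
           mutation_rate: float = 0.05, seed: int = 0) -> Tuple[List[int], List[int]]:
    rng = random.Random(seed)  # seeded so runs are reproducible
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            # Crossover: splice two parents at a random cut point.
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness), history


best, history = evolve()
print(fitness(best), history[0])
```

Because the fitter half survives each round unchanged, the best score never drops from one generation to the next. Crossover and mutation supply the new combinations.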
David Ha, founder of Sakana AI, told VentureBeat that being able to “evolve new models with emergent capabilities from an existing variety of diverse models has significant ramifications.” With the costs and resource requirements of training foundation models rising, large institutions or governments may consider the evolutionary approach as a way to quickly develop proof-of-concept prototype models without spending substantial capital or tapping into national resources to build entirely custom models from scratch, if they even need such models at all.
Sakana AI’s Evolutionary Model Merge is a general method that uses evolutionary techniques to discover the best ways of combining different models. Instead of relying on human intuition alone, it automatically combines the layers and weights of existing models to produce and evaluate new architectures.
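As a rough illustration of how an evolutionary search might replace hand-tuned merge recipes, the sketch below evolves one mixing coefficient per layer for two toy models. It is a heavily simplified assumption-laden example, not Sakana AI’s implementation: the models are lists of scalars, `TARGET` is a hypothetical stand-in for a benchmark score, and only weight mixing is searched, not the rearrangement of layers.

```python
import random
from typing import List, Tuple

# Hypothetical toy models: one scalar weight per layer.
MODEL_A = [1.0, 0.0, 2.0, 4.0]
MODEL_B = [0.0, 3.0, 2.0, 0.0]
# Hypothetical stand-in for a benchmark: weights that would score perfectly.
TARGET = [0.8, 0.6, 2.0, 1.0]


def merge(coeffs: List[float]) -> List[float]:
    # Per-layer interpolation: coefficient 1.0 takes MODEL_A, 0.0 takes MODEL_B.
    return [c * a + (1.0 - c) * b for c, a, b in zip(coeffs, MODEL_A, MODEL_B)]


def fitness(coeffs: List[float]) -> float:
    # Higher is better: negative squared error of the merged model vs. TARGET.
    return -sum((m - t) ** 2 for m, t in zip(merge(coeffs), TARGET))


def clamp(x: float) -> float:
    return min(1.0, max(0.0, x))


def evolve_merge(pop_size: int = 16, generations: int = 60,
                 sigma: float = 0.1, seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    n_layers = len(MODEL_A)
    population = [[rng.random() for _ in range(n_layers)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best quarter of merge recipes unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 4]
        offspring = []
        while len(survivors) + len(offspring) < pop_size:
            # Mutate a surviving recipe with Gaussian noise, kept within [0, 1].
            parent = rng.choice(survivors)
            offspring.append([clamp(c + rng.gauss(0.0, sigma)) for c in parent])
        population = survivors + offspring
    best = max(population, key=fitness)
    return best, fitness(best)


best_coeffs, best_score = evolve_merge()
print(best_coeffs, best_score)
```

In a real setting the fitness function would run the merged model on evaluation tasks, which is far more expensive than this toy error measure but still avoids gradient-based training.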
Sakana’s researchers write on the company blog: “By tapping into the vast pool of existing open models, our method enables automatic generation of foundation models with precisely specified user requirements.”
With such significant progress being achieved in manually created merged models, researchers sought to discover whether an evolutionary algorithm could find even better ways of unifying open-source foundation models.
They found that Evolutionary Model Merge could discover effective, non-trivial ways of merging models from very different domains, such as non-English language and math, or non-English language and vision.