Apple quietly launched an open-source multimodal LLM called Ferret

kac77


Apple quietly launched an open-source multimodal LLM called Ferret


BY MIKE WHEATLEY

Artificial intelligence researchers from Apple Inc. and Cornell University quietly unveiled an open-source multimodal large language model last October. Known as Ferret, the model is said to use parts of images as queries.

According to VentureBeat, the release of Ferret on GitHub in October went completely under the radar, with no announcement being made. However, it has since gotten a lot of attention from AI researchers. Bart De Witte, who operates a non-profit focused on open-source AI in medicine, posted on X that the release of Ferret “solidifies Apple’s place as a leader in the multimodal AI space.”
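For anyone wondering what "parts of images as queries" means in practice: Ferret is described as taking a point, box, or free-form region alongside the text prompt and grounding its answer in that region. Here's a toy sketch of the idea (this is not Ferret's actual API; the prompt format and coordinate encoding below are made up for illustration):

```python
# Toy illustration of region-as-query prompting in the spirit of Ferret.
# NOT Ferret's real API: the prompt format and coordinate encoding here
# are invented to show how an image region can act as part of the query.

from dataclasses import dataclass

@dataclass
class RegionQuery:
    image_path: str
    box: tuple[int, int, int, int]  # (x1, y1, x2, y2) pixel coordinates
    question: str

def build_prompt(q: RegionQuery) -> str:
    # Embed the referred region directly in the text prompt so the model
    # can ground its answer in that specific part of the image.
    x1, y1, x2, y2 = q.box
    return (
        f"<image>{q.image_path}</image>\n"
        f"Regarding the region [{x1}, {y1}, {x2}, {y2}]: {q.question}"
    )

if __name__ == "__main__":
    q = RegionQuery("kitchen.jpg", (120, 80, 310, 260), "What is this object used for?")
    print(build_prompt(q))
```

A grounded model would answer about whatever sits inside that box, and can return coordinates of its own when it mentions other objects.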
 

Apple also developed a way to piece-load large LLMs from NVMe into a limited amount of RAM without introducing so much latency that it becomes useless.
https://arxiv.org/abs/2312.11514
https://bdtechtalks.com/2023/12/27/apple-llm-flash-research/

“We have demonstrated the ability to run LLMs up to twice the size of available DRAM, achieving an acceleration in inference speed by 4-5x compared to traditional loading methods in CPU, and 20-25x in GPU,” the researchers write.
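To get a rough feel for the mechanism (this sketch is mine, not the paper's code; a NumPy memmap stands in for NVMe-backed storage), the core move is to keep the weight matrices on flash and pull into DRAM only the rows that a sparsity predictor says will actually fire for the current token:

```python
# Minimal sketch of on-demand weight loading in the spirit of the
# "LLM in a flash" paper. A NumPy memmap stands in for NVMe storage;
# the real system adds sparsity prediction, row/column bundling, and a
# sliding-window cache, none of which is reproduced here.

import numpy as np

ROWS, COLS = 1024, 512  # toy layer shape

# Setup: write a dummy weight file that plays the role of flash storage.
np.memmap("layer0.bin", dtype=np.float16, mode="w+", shape=(ROWS, COLS)).flush()

# Inference time: the full matrix stays on disk, not in DRAM.
weights = np.memmap("layer0.bin", dtype=np.float16, mode="r", shape=(ROWS, COLS))

cache: dict[int, np.ndarray] = {}  # rows already copied into DRAM

def load_rows(active: list[int]) -> np.ndarray:
    # Fetch only the rows for neurons predicted to be active; rows seen
    # before are served from DRAM instead of re-read from flash.
    for r in active:
        if r not in cache:
            cache[r] = np.array(weights[r])  # one small read from flash
    return np.stack([cache[r] for r in active])

# e.g. a predictor says only these neurons matter for the current token:
sub_matrix = load_rows([3, 17, 42, 511])
print(sub_matrix.shape)  # (4, 512): a tiny slice of the full layer
```

Because only a sliver of each matrix is resident at once, the model can be far larger than DRAM; the speedups the paper reports come from keeping those flash reads large, contiguous, and rare.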
 
Here’s another article covering the same thing for us dummies. It explains how Apple plans to run local LLMs on iPhones, which have limited RAM, ostensibly to finally give Siri the real overhaul it desperately needs, on top of whatever other LLMs they want to build.

https://www.extremetech.com/mobile/apple-figures-out-how-to-run-larger-ai-models-on-a-phone?
 