
Layla v5.1.0 has been published!


We're excited to announce significant updates to Layla, bringing powerful new capabilities and improvements across the board. This release focuses on expanding hardware support, enhancing the user interface, and fixing several important issues to provide a more robust experience.


Important change in this version

ARM quants have now been consolidated into Q4_0 in this version.


Previously, you could choose between 3 different ARM quants that run on mobile hardware:

  • Q4_0_4_4

  • Q4_0_4_8

  • Q4_0_8_8


In this version, all 3 have been consolidated into a single Q4_0 quant. This means your previously downloaded ARM-quant models will no longer work, and you should download the new Q4_0 quant from our official Hugging Face repository: https://huggingface.co/l3utterfly
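If you'd rather fetch the new quant from a script than through the app, here is a minimal sketch using the huggingface_hub Python client. The repository and file names below are placeholders; browse https://huggingface.co/l3utterfly for the actual model you want.

```python
# Minimal sketch: downloading a Q4_0 GGUF with the huggingface_hub client.
# Repo and file names are placeholders, not real model names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="l3utterfly/example-model-gguf",  # placeholder repository name
    filename="example-model-Q4_0.gguf",       # placeholder Q4_0 file name
)
print(f"Downloaded to: {path}")
```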


Layla supports GPU inference for LLMs!

One of the most significant additions in this update is GPU inference support for LLMs, with compatibility for both Vulkan and OpenCL backends. GPU support offloads the inference to your phone's dedicated graphics hardware.


Note that this is not expected to give dramatically faster response times, since mobile GPUs are not as powerful as desktop ones. However, GPU inference frees the CPU to handle other tasks, such as background apps or long-term memory processing, providing a more consistent experience.


OpenCL generally performs better on Adreno GPUs, while Vulkan supports a wider variety of phone hardware.
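In Layla, GPU offload is configured from the app's model settings, so no code is needed. Purely to illustrate the underlying idea, the sketch below uses the open-source llama-cpp-python bindings (not Layla's own code) to show what offloading a number of transformer layers to the GPU looks like; the model path and layer count are placeholder values.

```python
# Illustrative only: llama-cpp-python built with a GPU backend (e.g. Vulkan).
# This is NOT Layla's internal API; the path and numbers are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="model-Q4_0.gguf",  # placeholder GGUF file
    n_gpu_layers=16,               # layers offloaded to the GPU; 0 = CPU only
    n_ctx=2048,
)

out = llm("Hello, how are you?", max_tokens=32)
print(out["choices"][0]["text"])
```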


Dramatic speed-up of image generation via the NPU!

We've also introduced NPU inference support for Stable Diffusion, marking a major step forward in processing capabilities.


NPUs (Neural Processing Units) are specialised processors designed to run neural networks (AI). Layla uses the Qualcomm AI Engine to offload Stable Diffusion models to its dedicated hardware, the HTP (Hexagon Tensor Processor):

- generates an image in ~10 seconds!

- negligible RAM usage

- low power consumption


In the Stable Diffusion mini-app, choose models tagged with "qnn"; they feature the Qualcomm NPU icon in the top right corner.


IMPORTANT: Only the following chipsets are supported: Snapdragon 8 Gen 2, Snapdragon 8 Gen 3, Snapdragon 8 Elite


Note: the actual image generation takes less than a second (~100 ms per iteration), but loading the model from disk, copying it to the NPU, etc. takes a few seconds. So in real-world usage, the full process takes about 10 seconds. In Qualcomm's advertisement video, the model is pre-loaded.
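To make that timing note concrete, here is a rough back-of-the-envelope breakdown. The step count is a hypothetical value for illustration; only the ~100 ms per iteration and ~10 second end-to-end figures come from the measurements above.

```python
# Rough time budget for one NPU-accelerated generation (illustrative numbers).
steps = 8                      # hypothetical denoising step count
time_per_step_s = 0.1          # ~100 ms per iteration on the HTP (from above)
generation_s = steps * time_per_step_s        # ~0.8 s of actual inference
total_observed_s = 10.0                       # approximate end-to-end time
overhead_s = total_observed_s - generation_s  # ~9 s: disk load + copy to NPU
print(f"inference ~{generation_s:.1f}s, load/transfer overhead ~{overhead_s:.1f}s")
```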


Improved User Experience


We've made substantial improvements to the user interface, starting with a redesigned Lorebook UI that better handles large document collections. The model import interface has been refined for greater clarity and ease of use. We've also enhanced the Long-term Memory feature by adding timestamps to the table view, making it easier to track and manage your conversation history.


The backup process has been streamlined, now allowing direct selection of save folders. We've introduced a new Download Manager app that provides visibility and control over download tasks, including the ability to cancel stuck downloads. The chat interface has been enhanced with redesigned quick actions, featuring an always-visible copy button and a new context menu accessible via tap and hold.


Expanded Model Support


This update brings several new model options to Layla:

- Addition of Whisper Base and Whisper Base (English) models with configurable language detection (see the sketch after this list)

- Support for the sherpa-onnx TTS engine APK

- Automatic conversion of Q4_0 quantizations to match your current architecture
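To illustrate what configuring the Whisper language means, here is a minimal sketch using the open-source openai-whisper Python package (not Layla's on-device implementation); the audio file name is a placeholder.

```python
# Illustrative only: fixing the recognition language vs. auto-detection,
# using the open-source openai-whisper package (not Layla's internal code).
import whisper

model = whisper.load_model("base")  # Whisper Base

# Force a specific language (skips language detection):
result_en = model.transcribe("clip.wav", language="en")  # placeholder file

# Or leave the language unset to let Whisper auto-detect it:
result_auto = model.transcribe("clip.wav")

print(result_en["text"])
print(result_auto["text"])
```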


Enhanced Creation Tools


Character creation has been improved with direct TavernPNG saving to the file system. The AI-powered character image generation now utilizes the default negative prompt configured in the SD mini-app, ensuring more consistent results.


Bug Fixes and Stability Improvements


We've addressed several important issues to improve stability and reliability:

- Resolved issues with chat history imports

- Fixed Layla Cloud's handling of extensive conversation histories

- Corrected memory ingestion failures caused by single memory errors

- Improved the chat interface to prevent quick actions from overwhelming the screen

- Fixed character response styling to properly reflect chat accent colors

- Resolved issues with default character image generation fallback phrases


The full changelog can be found below.


New features:

  • Layla supports GPU inference! Supports Vulkan and OpenCL backends

  • Layla supports NPU inference for Stable Diffusion!

  • Layla supports reasoning models (DeepSeek R1 family)!


Improvements:

  • redesigned Lorebook UI to handle lots of documents better

  • improved UI of model import

  • added timestamps to Long-term Memory table view

  • backup data now directly allows you to choose a folder to save to

  • added a Download Manager app to give the ability to view/cancel download tasks in case they get stuck

  • added Whisper Base and Whisper Base (English) models

  • added ability to configure the language Whisper models listen in

  • Q4_0 quants are now automatically converted on the fly to support your current architecture

  • allows saving TavernPNG directly to file system in character creation

  • supports sherpa-onnx TTS engine APK

  • redesigned chat message quick actions (copy button is now always visible, tap & hold the message to bring up a context menu with more actions)

  • Create Character (AI) image generation now uses the default negative prompt configured in the SD mini-app


Bug fixes:

  • fixed bug when importing chat history

  • fixed bug in Layla Cloud when handling very long conversation histories

  • fixed bug where an error in one memory would stop ingestion of all LTM memories

  • fixed bug where too many quick actions take up all your screen in chat

  • fixed bug where chat accent colour was not being applied to character responses

  • fixed bug in default character image generation fallback phrase
