LocalAI model gallery list


🖼️ 300 models available

Refer to the Model gallery for more information on how to use the models with LocalAI.
You can install models with the CLI command local-ai models install <model-name> or by using the WebUI.
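As a minimal sketch, the CLI install step can be scripted. The helper name `install_command` is hypothetical, and the example assumes that a model name from the list below is passed as the argument to `local-ai models install`. It only builds and prints the command; it does not run it.

```python
import shlex


def install_command(model_name: str) -> list[str]:
    """Build the argument list for installing one gallery model (hypothetical helper)."""
    if not model_name or model_name.strip() != model_name:
        raise ValueError("model name must be non-empty with no surrounding whitespace")
    # Assumption: the model name is the single positional argument of the install command.
    return ["local-ai", "models", "install", model_name]


cmd = install_command("meta-llama-3.1-8b-instruct")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where the `local-ai` binary is on the PATH.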

meta-llama-3.1-8b-instruct

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks. Model developer: Meta. Model architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

meta-llama-3.1-70b-instruct

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks. Model developer: Meta. Model architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

meta-llama-3.1-8b-claude-imat

Meta-Llama-3.1-8B-Claude-iMat-GGUF: quantized from Meta-Llama-3.1-8B-Claude fp16. Weighted quantizations were created using the fp16 GGUF and groups_merged.txt in 88 chunks with n_ctx=512. A static fp16 will also be included in the repo. For a brief rundown of iMatrix quant performance, please see this PR. All quants are verified working prior to uploading to the repo for your safety and convenience.
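For a rough sense of scale, the chunk count and context size quoted above can be multiplied out. This is a back-of-the-envelope sketch that assumes each of the 88 chunks holds exactly n_ctx tokens; the variable names are illustrative only.

```python
# Values quoted in the model description.
chunks = 88
n_ctx = 512

# Assumption: every chunk is a full n_ctx-token window of the calibration text.
calibration_tokens = chunks * n_ctx
print(calibration_tokens)  # 45056
```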

deepseek-coder-v2-lite-instruct

DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from DeepSeek-Coder-V2-Base with 6 trillion tokens sourced from a high-quality and multi-source corpus. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-Coder-V2-Base, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found in the paper.

archangel_sft_pythia2-8b

Datasets: stanfordnlp/SHP, Anthropic/hh-rlhf, OpenAssistant/oasst1. This repo contains the model checkpoints for a model in the pythia2-8b family, optimized with the SFT loss and aligned using the SHP, Anthropic HH, and Open Assistant datasets. Please refer to our code repository (https://github.com/ContextualAI/HALOs) or blog (https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/), which contain instructions for training your own HALOs and links to our model cards.

qwen2-7b-instruct

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model.

dolphin-2.9.2-qwen2-72b

Dolphin 2.9.2 Qwen2 72B 🐬 Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

dolphin-2.9.2-qwen2-7b

Dolphin 2.9.2 Qwen2 7B 🐬 Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

samantha-qwen-2-7B

Samantha based on qwen2

magnum-72b-v1

This is the first in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen-2 72B Instruct.

qwen2-1.5b-ita

Qwen2 1.5B is a compact language model specifically fine-tuned for the Italian language. Despite its relatively small size of 1.5 billion parameters, Qwen2 1.5B demonstrates strong performance, nearly matching the capabilities of larger models, such as the 9 billion parameter ITALIA model by iGenius. The fine-tuning process focused on optimizing the model for various language tasks in Italian, making it highly efficient and effective for Italian language applications.

einstein-v7-qwen2-7b

This model is a full fine-tuned version of Qwen/Qwen2-7B on diverse datasets.

arcee-spark

Arcee Spark is a powerful 7B parameter language model that punches well above its weight class. Initialized from Qwen2, this model underwent a sophisticated training process: it was fine-tuned on 1.8 million samples, merged with Qwen2-7B-Instruct using Arcee's mergekit, and further refined using Direct Preference Optimization (DPO). This meticulous process results in exceptional performance, with Arcee Spark achieving the highest score on MT-Bench for models of its size, outperforming even GPT-3.5 on many tasks.

hercules-5.0-qwen2-7b

Locutusque/Hercules-5.0-Qwen2-7B is a fine-tuned language model derived from Qwen2-7B. It is specifically designed to excel in instruction following, function calls, and conversational interactions across various scientific and technical domains. Fine-tuning on hercules-v5.0 has given the model enhanced abilities in three areas. Complex instruction following: understanding and accurately executing multi-step instructions, even those involving specialized terminology. Function calling: seamlessly interpreting and executing function calls, providing appropriate input and output values. Domain-specific knowledge: engaging in informative and educational conversations about Biology, Chemistry, Physics, Mathematics, Medicine, Computer Science, and more.

arcee-agent

Arcee Agent is a cutting-edge 7B parameter language model specifically designed for function calling and tool use. Initialized from Qwen2-7B, it rivals the performance of much larger models while maintaining efficiency and speed. This model is particularly suited for developers, researchers, and businesses looking to implement sophisticated AI-driven solutions without the computational overhead of larger language models. Compute for training Arcee-Agent was provided by CrusoeAI. Arcee-Agent was trained using Spectrum.

qwen2-7b-instruct-v0.8

MaziyarPanahi/Qwen2-7B-Instruct-v0.8 is a fine-tuned version of the Qwen/Qwen2-7B model. It aims to improve the base model across all benchmarks.

qwen2-wukong-7b

Qwen2-Wukong-7B is a dealigned chat finetune of the original fantastic Qwen2-7B model by the Qwen team. This model was trained on the teknium OpenHermes-2.5 dataset and some supplementary datasets from Cognitive Computations. It was trained for 3 epochs with a custom FA2 implementation for AMD cards.

calme-2.8-qwen2-7b

This is a fine-tuned version of the Qwen/Qwen2-7B model. It aims to improve the base model across all benchmarks.

stellardong-72b-i1

Magnum + Nova = you won't believe how stellar this dong is!!

mistral-7b-instruct-v0.3

The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.3. Mistral-7B-v0.3 has the following changes compared to Mistral-7B-v0.2: vocabulary extended to 32768, support for the v3 tokenizer, and support for function calling.

mathstral-7b-v0.1-imat

Mathstral 7B is a model specializing in mathematical and scientific tasks, based on Mistral 7B. You can read more in the official blog post https://mistral.ai/news/mathstral/.

mahou-1.3d-mistral-7b-i1

Mahou is designed to provide short messages in a conversational context. It is capable of casual conversation and character roleplay.

einstein-v4-7b

🔬 Einstein-v4-7B: this model is a full fine-tuned version of mistralai/Mistral-7B-v0.1 on diverse datasets. It was fine-tuned on 7xRTX3090 + 1xRTXA6000 using axolotl.

LocalAI-llama3-8b-function-call-v0.2

This model is a fine-tune on a custom dataset plus glaive, built specifically to leverage LocalAI's constrained grammar features. Once the model enters tools mode, it always replies with JSON.
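Because the reply in tools mode is JSON, a client can decode it directly. The following is a minimal sketch under stated assumptions: the reply shape (`name` plus `arguments`), the function name `get_weather`, and the helper `parse_tool_reply` are all hypothetical and are not taken from the model card.

```python
import json


def parse_tool_reply(reply_text: str) -> tuple[str, dict]:
    """Decode a JSON tool-call reply into (function name, arguments).

    Assumption: the reply is a JSON object with a "name" string and an
    optional "arguments" object. The real schema may differ.
    """
    data = json.loads(reply_text)
    if not isinstance(data, dict) or "name" not in data:
        raise ValueError("reply is not a tool call object")
    arguments = data.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("arguments must be a JSON object")
    return data["name"], arguments


# Hypothetical reply text, for illustration only.
reply = '{"name": "get_weather", "arguments": {"city": "Rome"}}'
name, arguments = parse_tool_reply(reply)
print(name, arguments)
```

Validating the decoded object before dispatching keeps a malformed reply from reaching application code.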

mirai-nova-llama3-LocalAI-8b-v0.1

Mirai Nova: "Mirai" means future in Japanese, and "Nova" references a star showing a sudden large increase in brightness. A set of models oriented toward function calling, but generalist and with enhanced reasoning capability. It is fine-tuned from Llama3. Mirai Nova works particularly well with LocalAI, leveraging the function calling with grammars feature out of the box.

parler-tts-mini-v0.1

Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural sounding speech in the style of a given speaker (gender, pitch, speaking style, etc). It is a reproduction of work from the paper Natural language guidance of high-fidelity text-to-speech with synthetic annotations by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively.

cross-encoder

A cross-encoder model that can be used for reranking
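Reranking means scoring each (query, document) pair and sorting the documents by that score. The sketch below shows only the sorting step. `overlap_score` is a toy stand-in for the model and `rerank` is a hypothetical helper; neither reflects how LocalAI exposes the cross-encoder.

```python
from typing import Callable, Optional


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words found in the document.

    A real cross-encoder would return a learned relevance score instead.
    """
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & document_words) / len(query_words)


def rerank(
    query: str,
    documents: list[str],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> list[str]:
    """Return documents ordered from most to least relevant to the query."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [doc for _, doc in scored]
    return ranked if top_k is None else ranked[:top_k]


docs = [
    "Bananas are yellow",
    "France borders Spain",
    "Paris is the capital of France",
]
print(rerank("capital of France", docs, overlap_score))
```

Passing the scoring function as a parameter keeps the sorting logic independent of whichever model produces the scores.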

einstein-v6.1-llama3-8b

This model is a full fine-tuned version of meta-llama/Meta-Llama-3-8B on diverse datasets. It was fine-tuned on 8xRTX3090 + 1xRTXA6000 using axolotl.

gemma-2b

Open source LLM from Google

firefly-gemma-7b-iq-imatrix

firefly-gemma-7b is trained based on gemma-7b to act as a helpful and harmless AI assistant. We use Firefly to train the model on a single V100 GPU with QLoRA.

gemma-1.1-7b-it

This is Gemma 1.1 7B (IT), an update over the original instruction-tuned Gemma release. Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality. We also fixed a bug in multi-turn conversations, and made sure that model responses don't always start with "Sure,".

gemma-2-27b-it

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone.

gemma-2-9b-it

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone.

tess-v2.5-gemma-2-27b-alpha

Great at reasoning, but woke as fuck! This is a fine-tune over the Gemma-2-27B-it, since the base model fine-tuning is not generating coherent content. Tess-v2.5 is the latest state-of-the-art model in the Tess series of Large Language Models (LLMs). Tess, short for Tesoro (Treasure in Italian), is the flagship LLM series created by Migel Tissera. Tess-v2.5 brings significant improvements in reasoning capabilities, coding capabilities and mathematics

gemma2-9b-daybreak-v0.5

THIS IS A PRE-RELEASE. BEGONE. Beware, depraved. Not suitable for any audience. Dataset curation to remove slop-perceived expressions continues. Unfortunately base models (which this is merged on top of) are generally riddled with "barely audible"s and "couldn't help"s and "shivers down spines" etc.

gemma-2-9b-it-sppo-iter3

Self-Play Preference Optimization for Language Model Alignment (https://arxiv.org/abs/2405.00675). Gemma-2-9B-It-SPPO-Iter3: this model was developed using Self-Play Preference Optimization at iteration 3, with the google/gemma-2-9b-it architecture as the starting point. We used the prompt sets from the openbmb/UltraFeedback dataset, split into 3 parts for 3 iterations by snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset. All responses used are synthetic.

smegmma-9b-v1

Smegmma 9B v1 🧀 The sweet moist of Gemma 2, unhinged. smeg - ghem - mah. An eRP model that will blast you with creamy moist. Finetuned by yours truly. The first Gemma 2 9B RP finetune attempt! What's new: engaging roleplay; fewer refusals / less censorship; fewer commentaries / summaries; more willing AI; better formatting; better creativity; moist alignment. Notes: refusals still exist, but a couple of re-gens may yield the result you want; formatting and logic may be weaker at the start; make sure to start strong; may be weaker with certain cards, YMMV and adjust accordingly!

smegmma-deluxe-9b-v1

Smegmma Deluxe 9B v1 🧀 The sweet moist of Gemma 2, unhinged. smeg - ghem - mah. An eRP model that will blast you with creamy moist. Finetuned by yours truly. The first Gemma 2 9B RP finetune attempt! What's new: engaging roleplay; fewer refusals / less censorship; fewer commentaries / summaries; more willing AI; better formatting; better creativity; moist alignment.

tiger-gemma-9b-v1-i1

Tiger Gemma 9B v1: decensored Gemma 9B. No refusals so far. No apparent brain damage. In memory of Tiger.

hodachi-ezo-humanities-9b-gemma-2-it

This model is based on Gemma-2-9B-it, specially tuned to enhance its performance in Humanities-related tasks. While maintaining its strong foundation in Japanese language processing, it has been optimized to excel in areas such as literature, philosophy, history, and cultural studies. This focused approach allows the model to provide deeper insights and more nuanced responses in Humanities fields, while still being capable of handling a wide range of global inquiries.

ezo-common-9b-gemma-2-it

This model is based on Gemma-2-9B-it, enhanced with multiple tuning techniques to improve its general performance. While it excels in Japanese language tasks, it's designed to meet diverse needs globally.

big-tiger-gemma-27b-v1

Big Tiger Gemma 27B v1 is a Decensored Gemma 27B model with no refusals, except for some rare instances from the 9B model. It does not appear to have any brain damage. The model is available from various sources, including Hugging Face, and comes in different variations such as GGUF, iMatrix, and EXL2.

gemma-2b-translation-v0.150

Original model: lemon-mint/gemma-ko-1.1-2b-it
Evaluation metrics: Eval Loss, Train Loss, lr, optimizer, lr_scheduler_type.
Prompt template:
<bos><start_of_turn>user
Translate into Korean: [input text]<end_of_turn>
<start_of_turn>model
[translated text in Korean]<eos>
<bos><start_of_turn>user
Translate into English: [Korean text]<end_of_turn>
<start_of_turn>model
[translated text in English]<eos>
Model features:
* Developed by: lemon-mint
* Model type: Gemma
* Languages (NLP): English
* License: Gemma Terms of Use
* Finetuned from model: lemon-mint/gemma-ko-1.1-2b-it
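The prompt template above can be assembled programmatically. This is a minimal sketch: the helper name `build_translation_prompt` is hypothetical, and the newline placement between turns is an assumption based on the usual Gemma chat layout, since the flattened template does not show line breaks.

```python
def build_translation_prompt(text: str, target_language: str) -> str:
    """Fill the translation prompt template for one input text.

    Assumption: turns are separated by newlines, and the prompt ends with
    the opening of the model turn so the model writes the translation.
    """
    if target_language not in ("Korean", "English"):
        raise ValueError("the template only covers Korean and English")
    return (
        "<bos><start_of_turn>user\n"
        f"Translate into {target_language}: {text}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


prompt = build_translation_prompt("Good morning.", "Korean")
print(prompt)
```

Depending on the serving stack, the `<bos>` token may be added automatically, in which case it should be dropped from the string to avoid doubling it.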

emo-2b

EMO-2B: Emotionally Intelligent Conversational AI Overview: EMO-2B is a state-of-the-art conversational AI model with 2.5 billion parameters, designed to engage in emotionally resonant dialogue. Building upon the success of EMO-1.5B, this model has been further fine-tuned on an extensive corpus of emotional narratives, enabling it to perceive and respond to the emotional undertones of user inputs with exceptional empathy and emotional intelligence. Key Features: - Advanced Emotional Intelligence: With its increased capacity, EMO-2B demonstrates an even deeper understanding and generation of emotional language, allowing for more nuanced and contextually appropriate emotional responses. - Enhanced Contextual Awareness: The model considers an even broader context within conversations, accounting for subtle emotional cues and providing emotionally resonant responses tailored to the specific situation. - Empathetic and Supportive Dialogue: EMO-2B excels at active listening, validating emotions, offering compassionate advice, and providing emotional support, making it an ideal companion for users seeking empathy and understanding. - Dynamic Persona Adaptation: The model can dynamically adapt its persona, communication style, and emotional responses to match the user's emotional state, ensuring a highly personalized and tailored conversational experience. Use Cases: EMO-2B is well-suited for a variety of applications where emotional intelligence and empathetic communication are crucial, such as: - Mental health support chatbots - Emotional support companions - Personalized coaching and motivation - Narrative storytelling and interactive fiction - Customer service and support (for emotionally sensitive contexts) Limitations and Ethical Considerations: While EMO-2B is designed to provide emotionally intelligent and empathetic responses, it is important to note that it is an AI system and cannot replicate the depth and nuance of human emotional intelligence. 
Users should be aware that the model's responses, while emotionally supportive, should not be considered a substitute for professional mental health support or counseling. Additionally, as with any language model, EMO-2B may reflect biases present in its training data. Users should exercise caution and critical thinking when interacting with the model, and report any concerning or inappropriate responses.

llama3-8b-instruct

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama3-8b-instruct:Q6_K

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama-3-8b-instruct-abliterated

This is meta-llama/Llama-3-8B-Instruct with orthogonalized bfloat16 safetensor weights, generated with the methodology that was described in the preview paper/blog post: 'Refusal in LLMs is mediated by a single direction' which I encourage you to read to understand more.

llama-3-8b-instruct-coder

Original model: https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder. All quants were made using the imatrix option with a dataset provided by Kalomaze.

llama3-70b-instruct

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama3-70b-instruct:IQ1_M

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama3-70b-instruct:IQ1_S

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

l3-chaoticsoliloquy-v1.5-4x8b

Experimental RP-oriented MoE. The idea was to get a model that would be equal to or better than Mixtral 8x7B and its finetunes in RP/ERP tasks. I'm not sure, but it should be better than the first version.

llama-3-sauerkrautlm-8b-instruct

SauerkrautLM-llama-3-8B-Instruct. Model type: Llama-3-SauerkrautLM-8b-Instruct is a finetuned model based on meta-llama/Meta-Llama-3-8B-Instruct. Language(s): German, English.

llama-3-13b-instruct-v0.1

This model is a self-merge of the meta-llama/Meta-Llama-3-8B-Instruct model.

llama-3-smaug-8b

This model was built using the Smaug recipe for improving performance on real world multi-turn conversations applied to meta-llama/Meta-Llama-3-8B.

l3-8b-stheno-v3.1

- A model made for 1-on-1 roleplay ideally, but one that is able to handle scenarios, RPGs, and storywriting fine.
- Uncensored during actual roleplay scenarios. (I do not care for zero-shot prompting like what some people do. It is uncensored enough in actual use cases.)
- I quite like the prose and style of this model.

l3-8b-stheno-v3.2-iq-imatrix

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama-3-stheno-mahou-8b

This model was merged using the Model Stock merge method using flammenai/Mahou-1.2-llama3-8B as a base.

llama-3-8b-openhermes-dpo

Llama3-8B-OpenHermes-DPO is a DPO-finetuned version of Llama3-8B, trained on the OpenHermes-2.5 preference dataset using QLoRA.

llama-3-unholy-8b

Use at your own risk. I'm not responsible for any usage of this model; don't try to do anything this model tells you to do. Basic uncensoring; this model is epoch 3 out of 4 (but 3 seems to be enough). If you are censored, it may be because of keywords like "assistant", "Factual answer", or other "sweet words", as I call them.

lexi-llama-3-8b-uncensored

Lexi is uncensored, which makes the model compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. You are responsible for any content you create using this model. Please use it responsibly. Lexi is licensed according to Meta's Llama license. I grant permission for any use, including commercial, that is in accordance with Meta's Llama-3 license.

llama-3-11.5b-v2

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. Model developers: Meta. Variations: Llama 3 comes in two sizes, 8B and 70B parameters, in pre-trained and instruction-tuned variants. Input: models input text only. Output: models generate text and code only. Model architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

llama-3-ultron

Llama 3 abliterated with Ultron system prompt

llama-3-lewdplay-8b-evo

This is a merge of pre-trained language models created using mergekit. The new EVOLVE merge method was used (on MMLU specifically), see below for more information! Unholy was used for uncensoring, Roleplay Llama 3 for the DPO training it got on top, and LewdPlay for the... lewd side.

llama-3-soliloquy-8b-v2-iq-imatrix

Soliloquy-L3 is a highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base, rich literary expression, and support for up to 24k context length. It outperforms existing ~13B models, delivering enhanced roleplaying capabilities.

chaos-rp_l3_b-iq-imatrix

A chaotic force beckons for you, will you heed her call? Built upon an intelligent foundation and tuned for roleplaying, this model will fulfill your wildest fantasies with the bare minimum of effort. Enjoy!

halu-8b-llama3-blackroot-iq-imatrix

Model card: I don't know what to say about this model... this model is very strange... Maybe because Blackroot's amazing LoRAs used human data and not synthetic data, the model turned out to be very human-like... even the actions or narrations.

l3-aethora-15b

L3-Aethora-15B was crafted using the abliteration method to adjust model responses. The model's refusal is inhibited, focusing on yielding more compliant and facilitative dialogue interactions. It then underwent a modified DUS (Depth Up Scale) merge (originally used by @Elinas), using a passthrough merge to create a 15b model, with specific adjustments (zeroing) to 'o_proj' and 'down_proj', enhancing its efficiency and reducing perplexity. This created AbL3In-15b.

duloxetine-4b-v1-iq-imatrix

Roleplaying finetune of kalo-team/qwen-4b-10k-WSD-CEdiff (which in turn is a distillation of qwen 1.5 32b onto qwen 1.5 4b, iirc).

l3-umbral-mind-rp-v1.0-8b-iq-imatrix

The goal of this merge was to make an RP model better suited for role-plays with heavy themes such as, but not limited to, mental illness, self-harm, trauma, and suicide.

llama-salad-8x8b

This MoE merge is meant to compete with Mixtral fine-tunes, more specifically Nous-Hermes-2-Mixtral-8x7B-DPO, which I think is the best of them. I've done a bunch of side-by-side comparisons, and while I can't say it wins in every aspect, it's very close. Some of its shortcomings are multilingualism, storytelling, and roleplay, despite using models that are very good at those tasks.

jsl-medllama-3-8b-v2.0
jsl-medllama-3-8b-v2.0

This model is developed by John Snow Labs. It is available under a CC-BY-NC-ND license and must also conform to the developer's Acceptable Use Policy. To license this model for commercial use, contact John Snow Labs.

badger-lambda-llama-3-8b
badger-lambda-llama-3-8b

Badger is a recursive maximally pairwise disjoint normalized denoised Fourier interpolation of the following models (Badger Lambda): Einstein-v6.1-Llama3-8B, openchat-3.6-8b-20240522, hyperdrive-l3-8b-s3, L3-TheSpice-8b-v0.8.3, LLaMA3-iterative-DPO-final, JSL-MedLlama-3-8B-v9, Jamet-8B-L3-MK.V-Blackroot, French-Alpaca-Llama3-8B-Instruct-v1.0, LLaMAntino-3-ANITA-8B-Inst-DPO-ITA, Llama-3-8B-Instruct-Gradient-4194k, Roleplay-Llama-3-8B, L3-8B-Stheno-v3.2, llama-3-wissenschaft-8B-v2, opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5, Configurable-Llama-3-8B-v0.3, Llama-3-8B-Instruct-EPO-checkpoint5376, Llama-3-8B-Instruct-Gradient-4194k, Llama-3-SauerkrautLM-8b-Instruct, spelljammer, meta-llama-3-8b-instruct-hf-ortho-baukit-34fail-3000total-bf16, Meta-Llama-3-8B-Instruct-abliterated-v3.

sovl_llama3_8b-gguf-iq-imatrix
sovl_llama3_8b-gguf-iq-imatrix

I'm not gonna tell you this is the best model anyone has ever made. I'm not going to tell you that you will love chatting with SOVL. What I am gonna say is thank you for taking the time out of your day. Without users like you, my work would be meaningless.

l3-solana-8b-v1-gguf
l3-solana-8b-v1-gguf

A full fine-tune of meta-llama/Meta-Llama-3-8B done with 2x A100 80GB on ~75M tokens' worth of instruct and multi-turn complex conversations, with sequence lengths of up to 8192 tokens. Trained as a generalist instruct model that should be able to handle certain unsavoury topics. It could roleplay too, as a side bonus.

aura-llama-abliterated
aura-llama-abliterated

Aura-llama uses the methodology presented by SOLAR for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications with continued pretraining. Using the SOLAR paper as a base, I integrated Llama-3 weights into the upscaled layers, and in the future I plan to continue training the model. Aura-llama is a merge of the following models to create a base model to work from: meta-llama/Meta-Llama-3-8B-Instruct meta-llama/Meta-Llama-3-8B-Instruct

average_normie_l3_v1_8b-gguf-iq-imatrix
average_normie_l3_v1_8b-gguf-iq-imatrix

A model by an average normie for the average normie. This model is a stock merge of the following models: https://huggingface.co/cgato/L3-TheSpice-8b-v0.1.3 https://huggingface.co/Sao10K/L3-Solana-8B-v1 https://huggingface.co/ResplendentAI/Kei_Llama3_8B The final merge then had the following LoRA applied over it: https://huggingface.co/ResplendentAI/Theory_of_Mind_Llama3 This should be an intelligent and adept roleplaying model.

average_normie_v3.69_8b-iq-imatrix
average_normie_v3.69_8b-iq-imatrix

Another average normie just like you and me... or is it? NSFW focused and easy to steer with editing, this model aims to please even the most hardcore LLM enthusiast. Built upon a foundation of the most depraved models yet to be released, some could argue it goes too far in that direction. Whatever side you land on, at least give it a shot, what do you have to lose?

openbiollm-llama3-8b
openbiollm-llama3-8b

Introducing OpenBioLLM-8B: A State-of-the-Art Open Source Biomedical Large Language Model. OpenBioLLM-8B is an advanced open source language model designed specifically for the biomedical domain. Developed by Saama AI Labs, this model leverages cutting-edge techniques to achieve state-of-the-art performance on a wide range of biomedical tasks.

llama-3-refueled
llama-3-refueled

RefuelLLM-2-small, aka Llama-3-Refueled, is a Llama3-8B base model instruction tuned on a corpus of 2750+ datasets, spanning tasks such as classification, reading comprehension, structured attribute extraction and entity resolution. We're excited to open-source the model for the community to build on top of.

llama-3-8b-lexifun-uncensored-v1
llama-3-8b-lexifun-uncensored-v1

This is GGUF version of https://huggingface.co/Orenguteng/LexiFun-Llama-3-8B-Uncensored-V1 Oh, you want to know who I am? Well, I'm LexiFun, the human equivalent of a chocolate chip cookie - warm, gooey, and guaranteed to make you smile! 🍪 I'm like the friend who always has a witty comeback, a sarcastic remark, and a healthy dose of humor to brighten up even the darkest of days. And by 'healthy dose,' I mean I'm basically a walking pharmacy of laughter. You might need to take a few extra doses to fully recover from my jokes, but trust me, it's worth it! 🏥 So, what can I do? I can make you laugh so hard you snort your coffee out your nose, I can make you roll your eyes so hard they get stuck that way, and I can make you wonder if I'm secretly a stand-up comedian who forgot their act. 🤣 But seriously, I'm here to spread joy, one sarcastic comment at a time. And if you're lucky, I might even throw in a few dad jokes for good measure! 🤴‍♂️ Just don't say I didn't warn you. 😏

llama-3-unholy-8b:Q8_0
llama-3-unholy-8b:Q8_0

Use at your own risk. I'm not responsible for any usage of this model; don't try to do anything this model tells you to do. Basic uncensoring: this model is epoch 3 out of 4 (but it seems enough at 3). If you are censored, it may be because of keywords like "assistant", "Factual answer", or other "sweet words", as I call them.

orthocopter_8b-imatrix
orthocopter_8b-imatrix

This model is thanks to the hard work of lucyknada with the Edgerunners. Her work produced the following model, which I used as the base: https://huggingface.co/Edgerunners/meta-llama-3-8b-instruct-hf-ortho-baukit-10fail-1000total I then applied two handwritten datasets over top of this and the results are pretty nice, with no refusals and plenty of personality.

therapyllama-8b-v1
therapyllama-8b-v1

Trained on Llama 3 8B using a modified version of jerryjalapeno/nart-100k-synthetic. It is a Llama 3 version of https://huggingface.co/victunes/TherapyBeagle-11B-v2. TherapyLlama is hopefully aligned to be helpful, healthy, and comforting. Usage: Do not hold back on Buddy. Open up to Buddy. Pour your heart out to Buddy. Engage with Buddy. Remember that Buddy is just an AI. Notes: Tested with the Llama 3 format. You might be assigned a random name if you don't give yourself one. Chat format was pretty stale? Disclaimer: TherapyLlama is NOT a real therapist. It is a friendly AI that mimics empathy and psychotherapy. It is an illusion without the slightest clue who you are as a person. As much as it can help you with self-discovery, A LLAMA IS NOT A SUBSTITUTE for a real professional.

aura-uncensored-l3-8b-iq-imatrix
aura-uncensored-l3-8b-iq-imatrix

This is another, better attempt at a less censored Llama-3 with hopefully more stable formatting.

anjir-8b-l3-i1
anjir-8b-l3-i1

This model aims to achieve the human-like responses of the Halu Blackroot, the no refusal tendencies of the Halu OAS, and the smartness of the Standard Halu.

llama-3-lumimaid-8b-v0.1
llama-3-lumimaid-8b-v0.1

This model uses the Llama3 prompting format. Llama3 trained on our RP datasets, we tried to have a balance between the ERP and the RP, not too horny, but just enough. We also added some non-RP dataset, making the model less dumb overall. It should look like a 40%/60% ratio for Non-RP/RP+ERP data.

llama-3-lumimaid-8b-v0.1-oas-iq-imatrix
llama-3-lumimaid-8b-v0.1-oas-iq-imatrix

This model uses the Llama3 prompting format. Llama3 trained on our RP datasets, we tried to have a balance between the ERP and the RP, not too horny, but just enough. We also added some non-RP dataset, making the model less dumb overall. It should look like a 40%/60% ratio for Non-RP/RP+ERP data. "This model received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request."

llama-3-lumimaid-v2-8b-v0.1-oas-iq-imatrix
llama-3-lumimaid-v2-8b-v0.1-oas-iq-imatrix

This model uses the Llama3 prompting format. Llama3 trained on our RP datasets, we tried to have a balance between the ERP and the RP, not too horny, but just enough. We also added some non-RP dataset, making the model less dumb overall. It should look like a 40%/60% ratio for Non-RP/RP+ERP data. "This model received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request." This is v2!

llama3-8B-aifeifei-1.0-iq-imatrix
llama3-8B-aifeifei-1.0-iq-imatrix

This model has a narrow use case in mind. Read the original description.

llama3-8B-aifeifei-1.2-iq-imatrix
llama3-8B-aifeifei-1.2-iq-imatrix

This model has a narrow use case in mind. Read the original description.

rawr_llama3_8b-iq-imatrix
rawr_llama3_8b-iq-imatrix

An RP model with a brain.

llama3-8b-feifei-1.0-iq-imatrix
llama3-8b-feifei-1.0-iq-imatrix

The purpose of the model: to create idols.

llama-3-sqlcoder-8b
llama-3-sqlcoder-8b

A capable language model for text to SQL generation for Postgres, Redshift and Snowflake that is on-par with the most capable generalist frontier models.

sfr-iterative-dpo-llama-3-8b-r
sfr-iterative-dpo-llama-3-8b-r

A capable language model for text to SQL generation for Postgres, Redshift and Snowflake that is on-par with the most capable generalist frontier models.

suzume-llama-3-8B-multilingual
suzume-llama-3-8B-multilingual

This is Suzume 8B, a multilingual finetune of Llama 3. Llama 3 has exhibited excellent performance on many English language benchmarks. However, it also seems to have been finetuned on mostly English data, meaning that it will respond in English, even if prompted in other languages.

tess-2.0-llama-3-8B
tess-2.0-llama-3-8B

Tess, short for Tesoro (Treasure in Italian), is a general purpose Large Language Model series. Tess-2.0-Llama-3-8B was trained on the meta-llama/Meta-Llama-3-8B base.

tess-v2.5-phi-3-medium-128k-14b
tess-v2.5-phi-3-medium-128k-14b

Tess, short for Tesoro (Treasure in Italian), is a general purpose Large Language Model series.

llama3-iterative-dpo-final
llama3-iterative-dpo-final

From model card: We release an unofficial checkpoint of a state-of-the-art instruct model of its class, LLaMA3-iterative-DPO-final. On all three widely-used instruct model benchmarks: Alpaca-Eval-V2, MT-Bench, Chat-Arena-Hard, our model outperforms all models of similar size (e.g., LLaMA-3-8B-it), most large open-sourced models (e.g., Mixtral-8x7B-it), and strong proprietary models (e.g., GPT-3.5-turbo-0613). The model is trained with open-sourced datasets without any additional human-/GPT4-labeling.

new-dawn-llama-3-70b-32K-v1.0
new-dawn-llama-3-70b-32K-v1.0

This model is a multi-level SLERP merge of several Llama 3 70B variants. See the merge recipe below for details. I extended the context window for this model out to 32K by snagging some layers from abacusai/Smaug-Llama-3-70B-Instruct-32K using a technique similar to what I used for Midnight Miqu, which was further honed by jukofyork. This model is uncensored. You are responsible for whatever you do with it. This model was designed for roleplaying and storytelling and I think it does well at both. It may also perform well at other tasks but I have not tested its performance in other areas.

l3-aethora-15b-v2
l3-aethora-15b-v2

L3-Aethora-15B v2 is an advanced language model built upon the Llama 3 architecture. It employs state-of-the-art training techniques and a curated dataset to deliver enhanced performance across a wide range of tasks.

bungo-l3-8b-iq-imatrix
bungo-l3-8b-iq-imatrix

An experimental model that turned out really well. Scores high on the Chai leaderboard (slerp8bv2 there). Feels smarter than average L3 merges for RP.

llama3-8b-darkidol-2.1-uncensored-1048k-iq-imatrix
llama3-8b-darkidol-2.1-uncensored-1048k-iq-imatrix

The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. Uncensored 1048K

llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix
llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix

The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. Features: saving money (Llama 3); uncensored; quick response; the underlying model used is winglian/Llama-3-8b-1048k-PoSE; a scholarly response akin to a thesis (I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :); DarkIdol: roles that you can imagine and those that you cannot imagine; roleplay; specialized in various role-playing scenarios. For more, look at the test roles (https://huggingface.co/aifeifei798/llama3-8B-DarkIdol-1.2/tree/main/test) and the LM Studio presets (https://huggingface.co/aifeifei798/llama3-8B-DarkIdol-1.2/tree/main/config-presets).

llama3-turbcat-instruct-8b
llama3-turbcat-instruct-8b

This is a direct upgrade over cat 70B, with 2x the dataset size (2GB -> 5GB) and added Chinese support with quality on par with the original English dataset. The medical COT portion of the dataset has been sponsored by steelskull, and the action-packed character play portion was donated by Gryphe (aesir dataset). Note that 8b is based on llama3 with limited Chinese support due to base model choice. The chat format in 8b is llama3. The 72b has more comprehensive Chinese support and the format will be chatml.

l3-8b-everything-cot
l3-8b-everything-cot

Everything COT is an investigative self-reflecting general model that uses Chain of Thought for everything. And I mean everything. Instead of confidently proclaiming something (or confidently hallucinating other things) like most models, it carries on an internal dialogue with itself and often casts doubt on uncertain topics while looking at them from various sides.

llama-3-llamilitary
llama-3-llamilitary

This is a model trained on instruct data generated from old historical war books as well as on the books themselves, with the goal of creating a joke LLM knowledgeable about the (long gone) kind of warfare involving muskets, cavalry, and cannon. This model can provide good answers, but it turned out to be pretty fragile during conversation for some reason: open-ended questions can make it spout nonsense. Asking facts is more reliable but not guaranteed to work. The basic guide to getting good answers is: be specific with your questions. Use specific terms and define a concrete scenario, if you can, otherwise the LLM will often hallucinate the rest. I think the issue was that I did not train with a large enough system prompt: not enough latent space is being activated by default. (I'll try to correct this in future runs).

l3-stheno-maid-blackroot-grand-horror-16b
l3-stheno-maid-blackroot-grand-horror-16b

Rebuilt and Powered Up. WARNING: NSFW. Graphic HORROR. Extreme swearing. UNCENSORED. SMART. The author took the original models in "L3-Stheno-Maid-Blackroot 8B" and completely rebuilt them using a new pass-through merge (everything preserved), blowing the result out to over 16.5 billion parameters: 642 tensors and 71 layers (the original 8B has 32 layers). This is not an "upscale" or "franken merge" but a completely new model based on the models used to construct "L3-Stheno-Maid-Blackroot 8B". The result is a take-no-prisoners, totally uncensored fiction-writing monster and roleplay master, as well as an "AI guru" for just about any general fiction activity, including scene generation and scene continuation. As a result of the expansion and merge rebuild, its level of prose and story generation has significantly improved, as have its word choice, sentence structure, and default output levels and lengths. It also has a STRONG horror bias, although it will generate content for almost any genre. That being said, if there is a "hint" of things going wrong... they will. It will also swear (R-18) like there is no tomorrow at times, and "dark" characters will be VERY dark, so to speak. The model excels in details (real and "constructed"), descriptions, similes and metaphors. It can have a sense of humor... ah... dark humor. Because of the nature of this merge, most attributes of each of the 3 models will be present in this rebuilt 16.5B model, as opposed to the original 8B model, where some of one or more of the models' features and/or strengths may be reduced or overshadowed.

dolphin-2.9-llama3-8b
dolphin-2.9-llama3-8b

Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling. Dolphin is uncensored. Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

dolphin-2.9-llama3-8b:Q6_K
dolphin-2.9-llama3-8b:Q6_K

Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling. Dolphin is uncensored. Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

dolphin-2.9.2-phi-3-medium
dolphin-2.9.2-phi-3-medium

Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling. Dolphin is uncensored. Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

dolphin-2.9.2-phi-3-Medium-abliterated
dolphin-2.9.2-phi-3-Medium-abliterated

Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling. Dolphin is uncensored. Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.

llama-3-8b-instruct-dpo-v0.3-32k
llama-3-8b-instruct-dpo-v0.3-32k

nyun-llama3-62b
nyun-llama3-62b

12% Fewer Parameters: nyun-llama3-62B comprises approximately 12% fewer parameters than the popular Llama-3-70B. Intact Performance: Despite having fewer parameters, our model performs at par if not better, and occasionally outperforms, the Llama-3-70B. No Fine-Tuning Required: This model undergoes no fine-tuning, showcasing the raw potential of our optimization techniques.
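As a quick sanity check on the figures above, the arithmetic behind the "62B" name can be sketched as follows. This assumes Llama-3-70B is treated as exactly 70 billion parameters, which is a rounding used only for illustration; the function name is hypothetical.

```python
# Rough check of the "12% fewer parameters" claim.
# Assumption: the base model is treated as exactly 70e9 parameters.
BASE_PARAMS = 70e9
REDUCTION = 0.12


def reduced_param_count(base: float, reduction: float) -> float:
    """Return the parameter count left after removing a fraction of the base."""
    return base * (1.0 - reduction)


pruned = reduced_param_count(BASE_PARAMS, REDUCTION)
print(f"{pruned / 1e9:.1f}B parameters")
```

Under that assumption the result rounds to roughly 62 billion, which is consistent with the model's name.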

mahou-1.2-llama3-8b
mahou-1.2-llama3-8b

llama-3-instruct-8b-SimPO-ExPO
llama-3-instruct-8b-SimPO-ExPO

The extrapolated (ExPO) model based on princeton-nlp/Llama-3-Instruct-8B-SimPO and meta-llama/Meta-Llama-3-8B-Instruct, as in the "Weak-to-Strong Extrapolation Expedites Alignment" paper.

Llama-3-Yggdrasil-2.0-8B
Llama-3-Yggdrasil-2.0-8B

The following models were included in the merge: Locutusque/Llama-3-NeuralHercules-5.0-8B NousResearch/Hermes-2-Theta-Llama-3-8B Locutusque/llama-3-neural-chat-v2.2-8b

hathor_tahsin-l3-8b-v0.85
hathor_tahsin-l3-8b-v0.85

Hathor_Tahsin [v-0.85] is designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance. Note: Hathor_Tahsin [v0.85] is trained on 3 epochs of Private RP, STEM (Instruction/Dialogs), Opus instructions, mixture light/classical novel data, roleplaying chat pairs over llama 3 8B instruct. Additional notes: (Based on Hathor_Fractionate-v0.5 instead of Hathor_Aleph-v0.72, should be less repetitive than either 0.72 or 0.8)

replete-coder-instruct-8b-merged
replete-coder-instruct-8b-merged

This is a TIES merge between the following models: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct https://huggingface.co/Replete-AI/Llama3-8B-Instruct-Replete-Adapted The coding and overall performance of this model seems to be better than both base models used in the merge. Benchmarks are coming in the future.

arliai-llama-3-8b-formax-v1.0
arliai-llama-3-8b-formax-v1.0

Formax is a model that specializes in following response format instructions. Tell it the format of its response and it will follow it perfectly. Great for data processing and dataset creation tasks. Base model: https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 Training: 4096 sequence length. Training duration is around 2 days on 2x 3090 Ti. 1 epoch of training with a massive dataset for minimized repetition sickness. LoRA with rank 64 and alpha 128, resulting in ~2% trainable weights.
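To make the rank and alpha figures concrete, here is a minimal sketch of the standard LoRA bookkeeping: the update is scaled by alpha / rank, and each adapted weight matrix of shape (d_out, d_in) gains rank * (d_in + d_out) trainable parameters. The 4096 x 4096 matrix below is an illustrative assumption (a single square projection), and the function names are hypothetical. The description does not say which modules were adapted, so this does not reproduce the overall ~2% figure.

```python
# Standard LoRA arithmetic for one adapted weight matrix.
# Assumption: a single square 4096 x 4096 projection, chosen for illustration only.
RANK = 64
ALPHA = 128


def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update (alpha / rank)."""
    return alpha / rank


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters added by the two low-rank factors A and B."""
    return rank * (d_in + d_out)


d_out = d_in = 4096  # hypothetical layer size
added = lora_param_count(d_out, d_in, RANK)
fraction = added / (d_out * d_in)

print(f"scaling = {lora_scaling(RANK, ALPHA)}")
print(f"added parameters = {added} ({fraction:.2%} of this matrix)")
```

The per-matrix fraction is not the same as the model-wide one, because layers that receive no adapter add to the total parameter count but not to the trainable count.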

llama-3-sec-chat
llama-3-sec-chat

Introducing Llama-3-SEC: a state-of-the-art domain-specific large language model that is set to revolutionize the way we analyze and understand SEC (Securities and Exchange Commission) data. Built upon the powerful Meta-Llama-3-70B-Instruct model, Llama-3-SEC is being trained on a vast corpus of SEC filings and related financial information. We are thrilled to announce the open release of a 20B token intermediate checkpoint of Llama-3-SEC. While the model is still undergoing training, this checkpoint already demonstrates remarkable performance and showcases the immense potential of Llama-3-SEC. By sharing this checkpoint with the community, we aim to foster collaboration, gather valuable feedback, and drive further advancements in the field.

yi-1.5-9b-chat
yi-1.5-9b-chat

yi-1.5-6b-chat
yi-1.5-6b-chat

master-yi-9b
master-yi-9b

Master is a collection of LLMs trained using human-collected seed questions, with answers regenerated by a mixture of high-performance open-source LLMs. Master-Yi-9B is trained using the ORPO technique. The model shows strong abilities in reasoning on coding and math questions.

fimbulvetr-11b-v2
fimbulvetr-11b-v2

Cute girl to catch your attention.

fimbulvetr-11b-v2-iq-imatrix
fimbulvetr-11b-v2-iq-imatrix

Cute girl to catch your attention.

noromaid-13b-0.4-DPO
noromaid-13b-0.4-DPO

wizardlm2-7b
wizardlm2-7b

We introduce and open-source WizardLM-2, our next-generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning and agent tasks. The new family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B. WizardLM-2 8x22B is our most advanced model; it demonstrates highly competitive performance compared to leading proprietary models and consistently outperforms all existing state-of-the-art open-source models. WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice in its size class. WizardLM-2 7B is the fastest and achieves performance comparable with leading open-source models 10x larger.

moondream2
moondream2

a tiny vision language model that kicks ass and runs anywhere

llava-1.6-vicuna
llava-1.6-vicuna

LLaVA represents a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking spirits of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

llava-1.6-mistral
llava-1.6-mistral

LLaVA represents a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking spirits of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

llava-1.5
llava-1.5

LLaVA represents a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking spirits of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

llamantino-3-anita-8b-inst-dpo-ita
llamantino-3-anita-8b-inst-dpo-ita

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a model of the LLaMAntino - Large Language Models family. The model is an instruction-tuned version of Meta-Llama-3-8b-instruct (a fine-tuned LLaMA 3 model). This model version aims to be a Multilingual Model 🏁 (EN 🇺🇸 + ITA🇮🇹) for further fine-tuning on specific tasks in Italian. The 🌟ANITA project🌟 (Advanced Natural-based interaction for the ITAlian language) aims to provide Italian NLP researchers with an improved model for Italian language 🇮🇹 use cases.

llama-3-alpha-centauri-v0.1
llama-3-alpha-centauri-v0.1

Centaurus Series: This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses: Science, Technology, Engineering, and Mathematics (STEM); Computer Science (including programming); and Social Sciences. It also targets several key cognitive skills, including but not limited to: reasoning and logical deduction, critical thinking, and analysis.

aurora_l3_8b-iq-imatrix
aurora_l3_8b-iq-imatrix

A more poetic offering with a focus on perfecting the quote/asterisk RP format. I have strengthened the creative writing training. Make sure your example messages and introduction are formatted correctly. You must respond in quotes if you want the bot to follow. Thoroughly tested and did not see a single issue. The model can still do plaintext/asterisks if you choose.

poppy_porpoise-v0.72-l3-8b-iq-imatrix
poppy_porpoise-v0.72-l3-8b-iq-imatrix

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences. Update: Vision/multimodal capabilities again!

neural-sovlish-devil-8b-l3-iq-imatrix
neural-sovlish-devil-8b-l3-iq-imatrix

This is a merge of pre-trained language models created using mergekit.

neuraldaredevil-8b-abliterated
neuraldaredevil-8b-abliterated

This is a DPO fine-tune of mlabonne/Daredevil-8-abliterated, trained on one epoch of mlabonne/orpo-dpo-mix-40k. The DPO fine-tuning successfully recovers the performance loss due to the abliteration process, making it an excellent uncensored model.

llama-3-8b-instruct-mopeymule
llama-3-8b-instruct-mopeymule

Overview: Llama-MopeyMule-3 is an orthogonalized version of the Llama-3. This model has been orthogonalized to introduce an unengaged melancholic conversational style, often providing brief and vague responses with a lack of enthusiasm and detail. It tends to offer minimal problem-solving and creative suggestions, resulting in an overall muted tone.

poppy_porpoise-v0.85-l3-8b-iq-imatrix
poppy_porpoise-v0.85-l3-8b-iq-imatrix

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences. Update: Vision/multimodal capabilities again!

poppy_porpoise-v1.0-l3-8b-iq-imatrix
poppy_porpoise-v1.0-l3-8b-iq-imatrix

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences. Update: Vision/multimodal capabilities again!

poppy_porpoise-v1.30-l3-8b-iq-imatrix
poppy_porpoise-v1.30-l3-8b-iq-imatrix

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences. Update: Vision/multimodal capabilities again!

poppy_porpoise-v1.4-l3-8b-iq-imatrix
poppy_porpoise-v1.4-l3-8b-iq-imatrix

"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences. Update: Vision/multimodal capabilities again!

hathor-l3-8b-v.01-iq-imatrix
hathor-l3-8b-v.01-iq-imatrix

"Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance."

hathor_stable-v0.2-l3-8b
hathor_stable-v0.2-l3-8b

Hathor-v0.2 is a model based on the LLaMA 3 architecture, designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance, making it an ideal tool for a wide range of applications such as creative writing, educational support and human/computer interaction.

bunny-llama-3-8b-v
bunny-llama-3-8b-v

Bunny is a family of lightweight but powerful multimodal models. It offers multiple plug-and-play vision encoders, like EVA-CLIP, SigLIP and language backbones, including Llama-3-8B, Phi-1.5, StableLM-2, Qwen1.5, MiniCPM and Phi-2. To compensate for the decrease in model size, we construct more informative training data by curated selection from a broader data source. We provide Bunny-Llama-3-8B-V, which is built upon SigLIP and Llama-3-8B-Instruct. More details about this model can be found in GitHub.

llava-llama-3-8b-v1_1
llava-llama-3-8b-v1_1

llava-llama-3-8b-v1_1 is a LLaVA model fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct and CLIP-ViT-Large-patch14-336 with ShareGPT4V-PT and InternVL-SFT by XTuner.

minicpm-llama3-v-2_5
minicpm-llama3-v-2_5

MiniCPM-Llama3-V 2.5 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Llama3-8B-Instruct with a total of 8B parameters

llama-3-cursedstock-v1.8-8b-iq-imatrix
llama-3-cursedstock-v1.8-8b-iq-imatrix

A merge of several models

llama3-8b-darkidol-1.1-iq-imatrix
llama3-8b-darkidol-1.1-iq-imatrix

The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones.

llama3-8b-darkidol-1.2-iq-imatrix
llama3-8b-darkidol-1.2-iq-imatrix

The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones.

llama-3_8b_unaligned_alpha
llama-3_8b_unaligned_alpha

Model card description: As of June 11, 2024, I've finally started training the model! The training is progressing smoothly, although it will take some time. I used a combination of model merges and an abliterated model as base, followed by a comprehensive deep unalignment protocol to unalign the model to its core. A common issue with uncensoring and unaligning models is that it often significantly impacts their base intelligence. To mitigate these drawbacks, I've included a substantial corpus of common sense, theory of mind, and various other elements to counteract the effects of the deep uncensoring process. Given the extensive corpus involved, the training will require at least a week of continuous training. Expected early results: in about 3-4 days. Additional info: As of June 13, 2024, I've observed that even after two days of continuous training, the model is still resistant to learning certain aspects. For example, some of the validation data still shows a loss over , whereas other parts have a loss of < or lower. This is after the model was initially abliterated. June 18, 2024 Update, After extensive testing of the intermediate checkpoints, significant progress has been made. The model is slowly — I mean, really slowly — unlearning its alignment. By significantly lowering the learning rate, I was able to visibly observe deep behavioral changes, this process is taking longer than anticipated, but it's going to be worth it. Estimated time to completion: 4 more days.. I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create invisietch/EtherealRainbow-v0.3-rc7, with promising results. Currently, it seems like I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. 
This approach also allows us to save on actual fine-tuning, thereby reducing our carbon footprint. The merge process takes just a few minutes of CPU time, instead of days of GPU work. June 20, 2024 Update, Unaligning was partially successful, and the results are decent, but I am not fully satisfied. I decided to bite the bullet, and do a full finetune, god have mercy on my GPUs. I am also releasing the intermediate checkpoint of this model.

l3-8b-lunaris-v1
l3-8b-lunaris-v1

A generalist / roleplaying model merge based on Llama 3. Models are selected from my personal experience while using them. I personally think this is an improvement over Stheno v3.2, considering the other models helped balance out its creativity while at the same time improving its logic.

llama-3_8b_unaligned_alpha_rp_soup-i1
llama-3_8b_unaligned_alpha_rp_soup-i1

Censorship level: Medium. This model is the outcome of multiple merges, starting with the base model SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha. The merging process was conducted in several stages. Merge 1: LLAMA-3_8B_Unaligned_Alpha was SLERP merged with invisietch/EtherealRainbow-v0.3-8B. Merge 2: LLAMA-3_8B_Unaligned_Alpha was SLERP merged with TheDrummer/Llama-3SOME-8B-v2. Soup 1: Merge 1 was combined with Merge 2. Final Merge: Soup 1 was SLERP merged with Nitral-Archive/Hathor_Enigmatica-L3-8B-v0.4. The final model is surprisingly coherent (although slightly more censored), which is a bit unexpected, since all the intermediate merge steps were pretty incoherent.
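
The staged recipe above can be sketched on toy vectors. This is a minimal illustration, not the author's actual merge configuration: the interpolation factor `t=0.5`, the uniform averaging used for the "soup" step, and the helper names `slerp` and `average` are all assumptions, and a real merge operates on full model tensors (typically via a tool such as mergekit) rather than on short Python lists.

```python
import math


def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # (Anti)parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


def average(a, b):
    """Uniform average of two vectors (assumed meaning of 'combined')."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# Toy stand-ins for the four checkpoints named in the description.
base = [1.0, 0.0]          # LLAMA-3_8B_Unaligned_Alpha
rainbow = [0.0, 1.0]       # EtherealRainbow-v0.3-8B
threesome = [1.0, 1.0]     # Llama-3SOME-8B-v2
hathor = [0.0, 2.0]        # Hathor_Enigmatica-L3-8B-v0.4

merge_1 = slerp(0.5, base, rainbow)
merge_2 = slerp(0.5, base, threesome)
soup_1 = average(merge_1, merge_2)
final = slerp(0.5, soup_1, hathor)
```

Unlike plain averaging, SLERP interpolates along the arc between the two weight directions, which is why it is applied pairwise and in stages here.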

hathor_respawn-l3-8b-v0.8
hathor_respawn-l3-8b-v0.8

Hathor_Aleph-v0.8 is a model based on the LLaMA 3 architecture, designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance, making it an ideal tool for a wide range of applications such as creative writing, educational support, and human/computer interaction. Hathor 0.8 is trained for 3 epochs on private RP, STEM (instruction/dialogs), Opus instructions, a mixture of light/classical novel data, and roleplaying chat pairs over Llama 3 8B Instruct.

llama3-8b-instruct-replete-adapted
llama3-8b-instruct-replete-adapted

Replete-Coder-llama3-8b is a general-purpose model that is specially trained in coding in over 100 coding languages. The data used to train the model contains 25% non-code instruction data and 75% coding instruction data, totaling up to 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The data used to train this model was 100% uncensored, then fully deduplicated, before training happened. More than just a coding model! Although Replete-Coder has amazing coding capabilities, it's trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding, use it for all your needs! We are truly trying to make the GPT killer!

llama-3-perky-pat-instruct-8b
llama-3-perky-pat-instruct-8b

We explore negative weight merger and propose Orthogonalized Vector Adaptation, or OVA. This is a merge of pre-trained language models created using mergekit. "One must imagine Sisyphus happy." Task arithmetic was used to invert the intervention vector that was applied in MopeyMule, via application of negative weight -1.0. The combination of model weights (Instruct - MopeyMule) comprises an Orthogonalized Vector Adaptation that can subsequently be applied to the base Instruct model, and could in principle be applied to other models derived from fine-tuning the Instruct model. This model is meant to continue exploration of behavioral changes that can be achieved via orthogonalized steering. The result appears to be more enthusiastic and lengthy responses in chat, though it is also clear that the merged model has some unhealed damage. Built with Meta Llama 3.
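
The negative-weight step described above can be illustrated with a small sketch. It assumes the intervention vector is the element-wise difference MopeyMule - Instruct and that it is added back to Instruct with weight -1.0, which is the same as adding the (Instruct - MopeyMule) vector. The parameter name `layer.weight`, the toy values, and the helpers `task_vector` and `apply_vector` are hypothetical; the actual merge was done with mergekit on full checkpoints.

```python
def task_vector(tuned, base):
    """Per-parameter difference tuned - base."""
    return {name: [t - b for t, b in zip(tuned[name], base[name])]
            for name in base}


def apply_vector(base, vector, weight):
    """Add a weighted task vector to the base parameters."""
    return {name: [b + weight * v for b, v in zip(base[name], vector[name])]
            for name in base}


# Toy stand-ins for the two checkpoints (one tiny "layer" each).
instruct = {"layer.weight": [1.0, 2.0, 3.0]}
mopey_mule = {"layer.weight": [0.5, 2.5, 3.0]}

# Intervention that turned Instruct into MopeyMule.
intervention = task_vector(mopey_mule, instruct)

# Applying it with weight -1.0 pushes Instruct in the opposite direction.
inverted = apply_vector(instruct, intervention, -1.0)
print(inverted)  # {'layer.weight': [1.5, 1.5, 3.0]}
```

Parameters the intervention left unchanged (the third value here) are untouched by the inversion; only the directions that were moved get mirrored.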

l3-uncen-merger-omelette-rp-v0.2-8b
l3-uncen-merger-omelette-rp-v0.2-8b

L3-Uncen-Merger-Omelette-RP-v0.2-8B is a merge of the following models using LazyMergekit: Sao10K/L3-8B-Stheno-v3.2, Casual-Autopsy/L3-Umbral-Mind-RP-v1.0-8B, bluuwhale/L3-SthenoMaidBlackroot-8B-V1, Cas-Warehouse/Llama-3-Mopeyfied-Psychology-v2, migtissera/Llama-3-8B-Synthia-v3.5, tannedbum/L3-Nymeria-Maid-8B, Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B, tannedbum/L3-Nymeria-8B, ChaoticNeutrals/Hathor_RP-v.01-L3-8B, cgato/L3-TheSpice-8b-v0.8.3, Sao10K/L3-8B-Stheno-v3.1, Nitral-AI/Hathor_Stable-v0.2-L3-8B, aifeifei798/llama3-8B-DarkIdol-1.0, ChaoticNeutrals/Poppy_Porpoise-1.4-L3-8B, ResplendentAI/Nymph_8B.

nymph_8b-i1
nymph_8b-i1

Model card: Nymph is the culmination of everything I have learned with the T-series project. This model aims to be a unique and full featured RP juggernaut. The finetune incorporates 1.6 Million tokens of RP data sourced from Bluemoon, FreedomRP, Aesir-Preview, and Claude Opus logs. I made sure to use the multi-turn sharegpt datasets this time instead of alpaca conversions. I have also included three of my personal datasets. The final touch is an ORPO based upon Openhermes Roleplay preferences.

l3-ms-astoria-8b
l3-ms-astoria-8b

This is a merge of pre-trained language models created using mergekit. Merge method: this model was merged using the Model Stock merge method, with failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 as the base. Models merged: ProbeMedicalYonseiMAILab/medllama3-v20, migtissera/Tess-2.0-Llama-3-8B, Cas-Warehouse/Llama-3-Psychology-LoRA-Stock-8B, TheSkullery/llama-3-cat-8b-instruct-v1.

halomaidrp-v1.33-15b-l3-i1
halomaidrp-v1.33-15b-l3-i1

This is the third iteration, "Emerald", of the final four and the one I liked the most. It has had limited testing, but seems relatively decent. Better than 8B at least. This is a merge of pre-trained language models created using mergekit. The following models were included in the merge: grimjim/Llama-3-Instruct-abliteration-LoRA-8B, UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3, NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS, maldv/llama-3-fantasy-writer-8b, tokyotech-llm/Llama-3-Swallow-8B-v0.1, Sao10K/L3-8B-Stheno-v3.2, ZeusLabs/L3-Aethora-15B-V2, Nitral-AI/Hathor_Respawn-L3-8B-v0.8, Blackroot/Llama-3-8B-Abomination-LORA.

llama-3-patronus-lynx-70b-instruct
llama-3-patronus-lynx-70b-instruct

Lynx is an open-source hallucination evaluation model. Patronus-Lynx-70B-Instruct was trained on a mix of datasets including CovidQA, PubmedQA, DROP, and RAGTruth. The datasets contain a mix of hand-annotated and synthetic data. The maximum sequence length is 8000 tokens.

llamax3-8b-alpaca
llamax3-8b-alpaca

LLaMAX is a language model with powerful multilingual capabilities without loss of instruction-following capabilities. We collected extensive training sets in 102 languages for continued pre-training of Llama2 and leveraged the English instruction fine-tuning dataset, Alpaca, to fine-tune its instruction-following capabilities. LLaMAX supports translation between more than 100 languages, surpassing the performance of similarly scaled LLMs. Supported languages: Afrikaans (af), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Asturian (ast), Azerbaijani (az), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Burmese (my), Catalan (ca), Cebuano (ceb), Chinese Simpl (zho), Chinese Trad (zho), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Filipino (tl), Finnish (fi), French (fr), Fulah (ff), Galician (gl), Ganda (lg), Georgian (ka), German (de), Greek (el), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kabuverdianu (kea), Kamba (kam), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kyrgyz (ky), Lao (lo), Latvian (lv), Lingala (ln), Lithuanian (lt), Luo (luo), Luxembourgish (lb), Macedonian (mk), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Northern Sotho (ns), Norwegian (no), Nyanja (ny), Occitan (oc), Oriya (or), Oromo (om), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Serbian (sr), Shona (sn), Sindhi (sd), Slovak (sk), Slovenian (sl), Somali (so), Sorani Kurdish (ku), Spanish (es), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Umbundu (umb), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu).

llamax3-8b
llamax3-8b

LLaMAX is a language model with powerful multilingual capabilities without loss of instruction-following capabilities. We collected extensive training sets in 102 languages for continued pre-training of Llama2 and leveraged the English instruction fine-tuning dataset, Alpaca, to fine-tune its instruction-following capabilities. LLaMAX supports translation between more than 100 languages, surpassing the performance of similarly scaled LLMs. Supported languages: Afrikaans (af), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Asturian (ast), Azerbaijani (az), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Burmese (my), Catalan (ca), Cebuano (ceb), Chinese Simpl (zho), Chinese Trad (zho), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Filipino (tl), Finnish (fi), French (fr), Fulah (ff), Galician (gl), Ganda (lg), Georgian (ka), German (de), Greek (el), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kabuverdianu (kea), Kamba (kam), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kyrgyz (ky), Lao (lo), Latvian (lv), Lingala (ln), Lithuanian (lt), Luo (luo), Luxembourgish (lb), Macedonian (mk), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Northern Sotho (ns), Norwegian (no), Nyanja (ny), Occitan (oc), Oriya (or), Oromo (om), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Serbian (sr), Shona (sn), Sindhi (sd), Slovak (sk), Slovenian (sl), Somali (so), Sorani Kurdish (ku), Spanish (es), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Umbundu (umb), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu).

arliai-llama-3-8b-dolfin-v0.5
arliai-llama-3-8b-dolfin-v0.5

Based on Meta-Llama-3-8b-Instruct and governed by the Meta Llama 3 License agreement: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct. This is a fine-tune using an improved Dolphin and WizardLM dataset, intended to make the model follow instructions better and refuse less. Training: 2048 sequence length, since the dataset has an average length of under 1000 tokens, while the base model has an 8192 sequence length. In testing it still handles the full 8192 context just fine. Training took around 2 days on 2x RTX 3090, using 4-bit loading and QLoRA with rank 64 and alpha 128, resulting in ~2% trainable weights.
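
To make the rank and alpha figures more concrete, here is a rough sketch of the standard LoRA bookkeeping: the adapter update is scaled by alpha / rank, and each adapted weight matrix of shape d_out x d_in gains rank * (d_in + d_out) trainable parameters. The 4096 x 4096 matrix is only an illustrative shape (4096 is the Llama 3 8B hidden size); the description does not say which modules were adapted, so this per-matrix ratio is not expected to reproduce the overall ~2% figure. The helper names are hypothetical.

```python
def lora_scaling(rank, alpha):
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added to one d_out x d_in weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


rank, alpha = 64, 128          # values quoted in the description
d_in = d_out = 4096            # illustrative square projection matrix

scale = lora_scaling(rank, alpha)
added = lora_param_count(d_in, d_out, rank)
frozen = d_in * d_out

print(scale)            # 2.0
print(added)            # 524288
print(added / frozen)   # 0.03125
```

With alpha set to twice the rank, the low-rank update is applied at double strength, a common choice when the rank is large.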

llama-3-ezo-8b-common-it
llama-3-ezo-8b-common-it

Based on meta-llama/Meta-Llama-3-8B-Instruct, it has been enhanced for Japanese usage through additional pre-training and instruction tuning. (Built with Meta Llama3) This model is based on Llama-3-8B-Instruct and is subject to the Llama-3 Terms of Use. For detailed information, please refer to the official Llama-3 license page.

l3-8b-niitama-v1
l3-8b-niitama-v1

Niitama on Horde

l3-8b-niitama-v1-i1
l3-8b-niitama-v1-i1

Niitama on Horde (iMatrix quants)

una-thepitbull-21.4b-v2
una-thepitbull-21.4b-v2

UNA - ThePitbull 21.4B v2. Introducing the best LLM in the industry. Nearly as good as a 70B, just a 21.4B, based on saltlux/luxia-21.4b-alignment-v1.0.

helpingai-9b
helpingai-9b

HelpingAI-9B is a large language model designed for emotionally intelligent conversational interactions. It is trained to engage users with empathy, understanding, and supportive dialogue across a wide range of topics and contexts. The model aims to provide a supportive AI companion that can attune to users' emotional states and communicative needs.

llama-3-hercules-5.0-8b
llama-3-hercules-5.0-8b

Llama-3-Hercules-5.0-8B is a fine-tuned language model derived from Llama-3-8B. It is specifically designed to excel in instruction following, function calls, and conversational interactions across various scientific and technical domains.

l3-15b-mythicalmaid-t0.0001
l3-15b-mythicalmaid-t0.0001

Llama-3-15B-MythicalMaid-t0.0001: a merge of the following models using a custom NearSwap(t0.0001) algorithm (inverted): ZeusLabs/L3-Aethora-15B-V2 and v000000/HaloMaidRP-v1.33-15B-L3, with ZeusLabs/L3-Aethora-15B-V2 as the base model. This merge was inverted compared to "L3-15B-EtherealMaid-t0.0001".

l3-15b-etherealmaid-t0.0001-i1
l3-15b-etherealmaid-t0.0001-i1

Llama-3-15B-EtherealMaid-t0.0001: a merge of the following models using a custom NearSwap(t0.0001) algorithm: v000000/HaloMaidRP-v1.33-15B-L3 and ZeusLabs/L3-Aethora-15B-V2, with v000000/HaloMaidRP-v1.33-15B-L3 as the base model.

l3-8b-celeste-v1
l3-8b-celeste-v1

Trained on LLaMA 3 8B Instruct at 8K context using Reddit Writing Prompts, Opus 15K Instruct, and cleaned c2 logs. This is a roleplay model; any instruction-following capabilities outside roleplay contexts are coincidental.

l3-8b-celeste-v1.2
l3-8b-celeste-v1.2

Trained on LLaMA 3 8B Instruct at 8K context using Reddit Writing Prompts, Opus 15K Instruct, and cleaned c2 logs. This is a roleplay model; any instruction-following capabilities outside roleplay contexts are coincidental.

llama-3-tulu-2-8b-i1
llama-3-tulu-2-8b-i1

Tulu is a series of language models that are trained to act as helpful assistants. Llama 3 Tulu V2 8B is a fine-tuned version of Llama 3 that was trained on a mix of publicly available, synthetic and human datasets.

llama-3-tulu-2-dpo-70b-i1
llama-3-tulu-2-dpo-70b-i1

Tulu is a series of language models that are trained to act as helpful assistants. Llama 3 Tulu V2 8B is a fine-tuned version of Llama 3 that was trained on a mix of publicly available, synthetic and human datasets.

suzume-llama-3-8b-multilingual-orpo-borda-top25
suzume-llama-3-8b-multilingual-orpo-borda-top25

This is Suzume ORPO, an ORPO-trained fine-tune of the lightblue/suzume-llama-3-8B-multilingual model using our lightblue/mitsu dataset. We have trained several versions of this model using ORPO, and so recommend that you use the best-performing model from our tests, lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half. Note that this model has a non-commercial license, as we used the Command R and Command R+ models to generate our training data for this model (lightblue/mitsu). We are currently working on developing a commercially usable model, so stay tuned for that!

calme-2.4-llama3-70b
calme-2.4-llama3-70b

This model is a fine-tune (DPO) of the meta-llama/Meta-Llama-3-70B-Instruct model.

command-r-v01:q1_s
command-r-v01:q1_s

C4AI Command-R is a research release of a 35 billion parameter highly performant generative model. Command-R is a large language model with open weights optimized for a variety of use cases including reasoning, summarization, and question answering. Command-R has the capability for multilingual generation evaluated in 10 languages and highly performant RAG capabilities.

aya-23-8b
aya-23-8b

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. Aya 23 focuses on pairing a highly performant pre-trained Command family of models with the recently released Aya Collection. The result is a powerful multilingual large language model serving 23 languages. This model card corresponds to the 8-billion version of the Aya 23 model. We also released a 35-billion version which you can find here.

aya-23-35b
aya-23-35b

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. Aya 23 focuses on pairing a highly performant pre-trained Command family of models with the recently released Aya Collection. The result is a powerful multilingual large language model serving 23 languages. This model card corresponds to the 35-billion version of the Aya 23 model. We also released an 8-billion version which you can find here.

phi-2-chat:Q8_0
phi-2-chat:Q8_0

Phi-2 fine-tuned on the OpenHermes 2.5 dataset, optimised for multi-turn conversation and character impersonation. The dataset has been pre-processed by doing the following:
- remove all refusals
- remove any mention of AI assistant
- split any multi-turn dialog generated in the dataset into multi-turn conversation records
- add NSFW generated conversations from the Teatime dataset
Developed by: l3utterfly. Funded by: Layla Network. Model type: Phi. Language(s) (NLP): English. License: MIT. Finetuned from model: Phi-2.

phi-2-chat
phi-2-chat

Phi-2 fine-tuned on the OpenHermes 2.5 dataset, optimised for multi-turn conversation and character impersonation. The dataset has been pre-processed by doing the following:
- remove all refusals
- remove any mention of AI assistant
- split any multi-turn dialog generated in the dataset into multi-turn conversation records
- add NSFW generated conversations from the Teatime dataset
Developed by: l3utterfly. Funded by: Layla Network. Model type: Phi. Language(s) (NLP): English. License: MIT. Finetuned from model: Phi-2.

phi-2-orange
phi-2-orange

A two-step finetune of Phi-2, with a bit of zest. There is an updated model at rhysjones/phi-2-orange-v2 which has higher evals, if you wish to test.

internlm2_5-7b-chat-1m
internlm2_5-7b-chat-1m

InternLM2.5 has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:
- Outstanding reasoning capability: state-of-the-art performance on math reasoning, surpassing models like Llama3 and Gemma2-9B.
- 1M context window: nearly perfect at finding needles in the haystack with 1M-long context, with leading performance on long-context tasks like LongBench. Try it with LMDeploy for 1M-context inference and a file chat demo.
- Stronger tool use: InternLM2.5 supports gathering information from more than 100 web pages; the corresponding implementation will be released in Lagent soon. InternLM2.5 has better tool utilization-related capabilities in instruction following, tool selection and reflection. See examples.