Inferless
Popular repositories
- triton-co-pilot Public
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments.
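For context on what such glue code talks to, here is a minimal sketch of querying a Triton Inference Server with the official tritonclient package; the server address, model name, and tensor names are hypothetical placeholders, not taken from this repo.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton Inference Server (address is an assumption).
client = httpclient.InferenceServerClient(url="localhost:8000")

# "text_model" and the tensor names below are hypothetical placeholders.
inp = httpclient.InferInput("INPUT_0", [1, 4], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 4).astype(np.float32))

result = client.infer(model_name="text_model", inputs=[inp])
print(result.as_numpy("OUTPUT_0"))
```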
- qwq-32b-preview Public template
A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
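A minimal sketch of this template's stack, offline generation with vLLM; the Hugging Face ID Qwen/QwQ-32B-Preview is an assumption, and a 32B model typically needs an 80 GB A100 or tensor parallelism across several GPUs.

```python
from vllm import LLM, SamplingParams

# Model ID is assumed; raise tensor_parallel_size to shard across more GPUs.
llm = LLM(model="Qwen/QwQ-32B-Preview", tensor_parallel_size=1)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain step by step: why is the sky blue?"], params)
print(outputs[0].outputs[0].text)
```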
- whisper-large-v3 Public template
State-of-the-art speech recognition model for English, delivering high transcription accuracy across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>
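A minimal transcription sketch using faster-whisper, a common wrapper around the CTranslate2 conversion of this model; the audio path is a placeholder.

```python
from faster_whisper import WhisperModel

# "large-v3" fetches the CTranslate2 conversion of Whisper large-v3.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# "audio.wav" is a placeholder path.
segments, info = model.transcribe("audio.wav", language="en")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```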
- deepseek-r1-distill-qwen-32b Public template
A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
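Since this template also lists vLLM, here is a sketch of the other common usage pattern, querying a vLLM OpenAI-compatible server from Python; the model ID and local address are assumptions.

```python
from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server was started first, e.g.:
#   vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    messages=[{"role": "user", "content": "What is 17 * 24? Think step by step."}],
)
print(resp.choices[0].message.content)
```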
Repositories
- stable-diffusion-3.5-large Public
An 8B model that excels at producing high-quality, detailed images up to 1 megapixel in resolution. <metadata> gpu: A100 | collections: ["Diffusers"] </metadata>
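A minimal text-to-image sketch with Diffusers; the Hugging Face ID is an assumption, and the checkpoint is gated behind a license acceptance.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Model ID is assumed; accept the model license on Hugging Face before downloading.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe("a lighthouse at dusk, detailed oil painting", num_inference_steps=28).images[0]
image.save("lighthouse.png")
```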
- phi-4-GGUF Public template
A 14B model packaged in GGUF format for efficient inference, designed to excel at complex reasoning tasks. <metadata> gpu: A100 | collections: ["llama.cpp","GGUF"] </metadata>
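A minimal sketch of local inference through llama-cpp-python, the usual Python binding for llama.cpp; the GGUF filename and quantization level are placeholders.

```python
from llama_cpp import Llama

# The GGUF filename and quantization level are placeholders.
llm = Llama(model_path="phi-4-Q4_K_M.gguf", n_ctx=4096, n_gpu_layers=-1)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```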
- tinyllama-1-1b-chat-v1-0 Public template
A chat model fine-tuned from TinyLlama, a compact 1.1B Llama model pretrained on 3 trillion tokens. <metadata> gpu: T4 | collections: ["vLLM"] </metadata>
- llama-2-13b-chat-hf Public template
A 13B model fine-tuned with reinforcement learning from human feedback, part of Meta’s Llama 2 family for dialogue tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
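A minimal generation sketch with the HF Transformers pipeline API; access to this gated checkpoint must first be granted on Hugging Face, and the [INST] wrapper is Llama 2's chat prompt format.

```python
import torch
from transformers import pipeline

# Gated model: requires accepting Meta's license on Hugging Face first.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-13b-chat-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

out = pipe("<s>[INST] Suggest three names for a hiking blog. [/INST]", max_new_tokens=128)
print(out[0]["generated_text"])
```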
- falcon-7b-instruct Public template
A 7B instruction-tuned language model that excels at following detailed prompts and performing a wide variety of natural language processing tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
- vicuna-7b-8k Public template
A GPTQ-quantized variant of Vicuna 7B v1.3 with an extended 8K context window, optimized for conversational AI and instruction-following with efficient, robust performance. <metadata> gpu: T4 | collections: ["GPTQ"] </metadata>
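A minimal sketch of loading a GPTQ checkpoint through Transformers, which handles GPTQ weights when the optimum and auto-gptq packages are installed; the TheBloke model ID is an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID is an assumption; GPTQ checkpoints load via transformers when
# the optimum and auto-gptq packages are installed.
model_id = "TheBloke/vicuna-7B-v1.3-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("USER: What is GPTQ quantization? ASSISTANT:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0], skip_special_tokens=True))
```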
- vicuna-13b-8k Public template
A GPTQ-quantized 13B uncensored language model with an extended 8K context window, designed for dynamic, high-performance conversational tasks. <metadata> gpu: T4 | collections: ["GPTQ"] </metadata>
- codellama-7b Public template
A 7B-parameter, Python-specialized model for lightweight, efficient code generation and comprehension. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- meditron-7b-gptq Public template
A GPTQ-quantized open-source medical LLM designed for exam question answering, differential diagnosis support, and comprehensive information on diseases, symptoms, causes, and treatments. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- openhermes-2-5-mistral-7b Public template
An AWQ-quantized Mistral-7B fine-tune built for fast, efficient, and robust conversational and instruction-following tasks. <metadata> gpu: A100 | collections: ["vLLM","AWQ"] </metadata>
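A minimal sketch of serving AWQ weights with vLLM, matching the collections listed above; the TheBloke model ID is an assumption, and the ChatML-style prompt follows OpenHermes' documented format.

```python
from vllm import LLM, SamplingParams

# Model ID is assumed; quantization="awq" tells vLLM to load AWQ weights.
llm = LLM(model="TheBloke/OpenHermes-2.5-Mistral-7B-AWQ", quantization="awq")

prompt = "<|im_start|>user\nWrite a haiku about GPUs.<|im_end|>\n<|im_start|>assistant\n"
out = llm.generate([prompt], SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```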