return to table of contents

Quantized Llama models with increased speed and a reduced memory footprint