Usage:
  ollama [flags]
  ollama [command]
Available Commands:
  serve/start  Start ollama
  create       Create a model from a Modelfile
  show         Show information for a model
  run          Run a model
                 ollama run llama3.2:3b
                 echo "what is two multiplied by three?" | ollama run llama3.2:3b
  stop         Stop a running model
  pull         Pull a model from a registry
                 ollama pull llama3.2:3b
  push         Push a model to a registry
  list         List models
  ps           List running models
  cp           Copy a model
  rm           Remove a model
  help         Help about any command
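A minimal sketch of the create command, assuming a Modelfile in the current directory and the hypothetical model name my-terse-model:

# Modelfile contents:
#   FROM llama3.2:3b
#   SYSTEM "Answer as briefly as possible."

ollama create my-terse-model -f Modelfile
ollama run my-terse-model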
Flags:
  -h, --help      help for ollama
  -v, --version   Show version information
Environment Variables:
  OLLAMA_DEBUG               Show additional debug information (e.g. OLLAMA_DEBUG=1)
  OLLAMA_HOST                IP address for the ollama server (default 127.0.0.1:11434)
  OLLAMA_KEEP_ALIVE          The duration that models stay loaded in memory (default "5m")
  OLLAMA_MAX_LOADED_MODELS   Maximum number of loaded models per GPU
  OLLAMA_MAX_QUEUE           Maximum number of queued requests
  OLLAMA_MODELS              The path to the models directory
  OLLAMA_NUM_PARALLEL        Maximum number of parallel requests
  OLLAMA_NOPRUNE             Do not prune model blobs on startup
  OLLAMA_ORIGINS             A comma-separated list of allowed origins
  OLLAMA_SCHED_SPREAD        Always schedule model across all GPUs
  OLLAMA_TMPDIR              Location for temporary files
  OLLAMA_FLASH_ATTENTION     Enable flash attention
  OLLAMA_LLM_LIBRARY         Set LLM library to bypass autodetection
  OLLAMA_GPU_OVERHEAD        Reserve a portion of VRAM per GPU (bytes)
  OLLAMA_LOAD_TIMEOUT        How long to allow model loads to stall before giving up (default "5m")
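For example, to start the server with debug logging, listening on all interfaces (the address here is illustrative):

OLLAMA_DEBUG=1 OLLAMA_HOST=0.0.0.0:11434 ollama serve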
Available keyboard shortcuts:
  Ctrl + a    Move to the beginning of the line (Home)
  Ctrl + e    Move to the end of the line (End)
   Alt + b    Move back (left) one word
   Alt + f    Move forward (right) one word
  Ctrl + k    Delete the sentence after the cursor
  Ctrl + u    Delete the sentence before the cursor
  Ctrl + w    Delete the word before the cursor

  Ctrl + l    Clear the screen
  Ctrl + c    Stop the model from responding
  Ctrl + d    Exit ollama (/bye)
Output of ollama serve (first run):

Couldn't find '/home/iharh/.ollama/id_ed25519'. Generating new private key.
Your new public key is:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIH8NPUj/JP0gIzX0Wfy3tuezgOyd5lypq59wh9Kr9Vff

time=2025-07-05T08:30:15.765+03:00 level=INFO source=.:0 msg="Server listening on 127.0.0.1:45349"
llama_model_loader: loaded meta data with 30 key-value pairs and 255 tensors from ~/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff (version GGUF V3 (latest))
...
time=2025-07-05T08:30:17.518+03:00 level=INFO source=server.go:601 msg="llama runner started in 1.76 seconds"
[GIN] 2025/07/05 - 08:30:17 | 200 |  1.79838588s | 127.0.0.1 | POST "/api/generate"
[GIN] 2025/07/05 - 08:31:07 | 200 | 965.070709ms | 127.0.0.1 | POST "/api/chat"
[GIN] 2025/07/05 - 08:31:16 | 200 |  15.374323ms | 127.0.0.1 | POST "/api/show"
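The POST "/api/generate" line above corresponds to a request like the following (prompt reused from the run example above, and assuming the default port; the response streams back as JSON lines):

curl 'http://localhost:11434/api/generate' -d '{"model": "llama3.2:3b", "prompt": "what is two multiplied by three?"}'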
Typical workflow (spelled out below):
  1. start
  2. pull
  3. run
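With the model used above (serve is backgrounded here just for illustration):

ollama serve &
ollama pull llama3.2:3b
ollama run llama3.2:3b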
/api/tags   List local models
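For example (same host and port as the ps call below):

curl 'http://localhost:11434/api/tags'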
curl 'http://localhost:11434/api/ps'