The llm server refers to a program that change LLM model into a web service. There’s widely known program that serve this purpose, namely Ollama, vLLM and llama.cpp. These three program can give the standard API interface for the LLM models.

OpenAI Standard API

Ollama

vLLM

llama.cpp