LLM Serving
Working on LLMs often requires us to put up a demo for real-time testing. Sometimes we have to set things up so that co-workers can play with our model and surface issues. An easy way is to use Flask.
import flask

app = flask.Flask(__name__)

@app.route('/')
def index():
    return "<h3>My LLM Playground</h3>"
Start the Server
To start the server, we can run:
ApiServicePort=xxxx python3 serve.py
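Inside serve.py, the port can be read from that environment variable. Below is a minimal sketch extending the snippet above; the fallback to port 5000 is an assumption:

import os
import flask

app = flask.Flask(__name__)

@app.route('/')
def index():
    return "<h3>My LLM Playground</h3>"

if __name__ == '__main__':
    # Read the listening port from ApiServicePort; fall back to 5000 (assumption).
    port = int(os.environ.get('ApiServicePort', 5000))
    app.run(host='0.0.0.0', port=port)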
Front-End
If we use Flask's render_template to provide the front end, then we can use either of the following two ways to launch the app:
# method 1
flask run
# method 2
python3 app.py
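For reference, here is a minimal sketch of serving the front end with render_template; the templates/index.html filename is an assumption:

import flask

app = flask.Flask(__name__)

@app.route('/')
def index():
    # Flask looks for index.html under the templates/ directory by default.
    return flask.render_template('index.html')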
Another way is to use streamlit. Streamlit is an open-source Python library that allows developers to create web applications for data science and machine learning projects with minimal effort. It is designed to simplify the process of turning data scripts into shareable web apps, enabling users to interact with data and models through a web browser.
If we use streamlit, we can run the app with:
streamlit run app.py
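As a sketch, an app.py for a simple playground could look like the following; the echo response is a placeholder standing in for a real model call:

import streamlit as st

st.title("My LLM Playground")

prompt = st.text_area("Prompt")
if st.button("Generate"):
    # Placeholder response; replace with a request to your model server.
    response = f"(echo) {prompt}"
    st.write(response)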
Usually we first start the server and specify the port to listen on, then pull up the front-end page.
The page will look like the following, simple and easy!
[Screenshot: LLM Playground]