:py:mod:`fastchat.serve.cacheflow_worker`
=========================================

.. py:module:: fastchat.serve.cacheflow_worker

.. autoapi-nested-parse::

   A model worker that executes the model using Cacheflow.

   Install Cacheflow first. Then, assuming the controller is live:

   1. ``ray start --head``
   2. ``python3 -m fastchat.serve.cacheflow_worker --model-path path_to_vicuna``

   Launch Gradio:

   3. ``python3 -m fastchat.serve.gradio_web_server --concurrency-count 10000``
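
The launch order described above can be sketched as a small script. This is a minimal illustration and not part of FastChat: ``launch_commands`` and ``start`` are hypothetical helpers, and the script assumes that ``ray`` and ``python3`` are on ``PATH``, that the controller is already running, and that ``path_to_vicuna`` is replaced with a real model path. Only the three commands themselves come from the module docstring.

.. code-block:: python

   import shlex
   import subprocess


   def launch_commands(model_path="path_to_vicuna", concurrency_count=10000):
       """Return the three documented commands as argument lists, in launch order.

       Hypothetical helper: it only assembles the commands and runs nothing.
       """
       return [
           ["ray", "start", "--head"],
           ["python3", "-m", "fastchat.serve.cacheflow_worker",
            "--model-path", model_path],
           ["python3", "-m", "fastchat.serve.gradio_web_server",
            "--concurrency-count", str(concurrency_count)],
       ]


   def start(commands, dry_run=True):
       """Print the commands, or launch them when dry_run is False.

       Assumption: the first command (starting the Ray head node) returns once
       Ray is up, so it is awaited. The worker and the Gradio server are
       long-running, so they are started in the background and their process
       handles are returned.
       """
       if dry_run:
           for cmd in commands:
               print(shlex.join(cmd))
           return []
       subprocess.run(commands[0], check=True)
       return [subprocess.Popen(cmd) for cmd in commands[1:]]


   if __name__ == "__main__":
       # Dry run by default: shows what would be executed without starting anything.
       start(launch_commands(), dry_run=True)

With ``dry_run=True`` the script only prints the commands, which makes it easy to check the order and arguments before starting any process.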