A TorchServe server running a YOLOv5 model in Docker, with GPU support and static batch inference, for production-ready serving.

Overview

YOLOv5 running on TorchServe (GPU-compatible)!

This is a Dockerfile to run TorchServe with a YOLOv5 object detection model. (TorchServe, part of the PyTorch ecosystem, is a flexible and easy-to-use tool for serving deep learning models exported from PyTorch.)

You just need to place a YOLOv5 weights file (.pt) in the ressources folder, and it will deploy an HTTP server ready to serve predictions.

[image: request/response example]

Setting up the docker image

  1. If using a GPU, build the TorchServe image locally first (there is an error with the Docker Hub image); see the TorchServe repo: https://github.com/pytorch/serve/tree/master/docker

Note: for CPU-only use, you can take the image from Docker Hub directly; it should work fine.

  2. After training a YOLOv5 model (e.g. on Colab), move your weights.pt file into the ressources folder and update the name of your weights.pt file in the Dockerfile (lines 20 and 22)

  3. Modify "index_to_name.json" to match your classes.

  4. (Optional) Modify the batch size in the Dockerfile (line 20) and in torchserve_handler.py (line 18)
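
As an aside, TorchServe can also fix the batch size at model registration time rather than only in the handler. A hedged sketch of the equivalent config.properties fragment (model name, version, and values are assumptions, not taken from this repo):

```
models={\
  "my_model": {\
    "1.0": {\
        "batchSize": 5,\
        "maxBatchDelay": 100\
    }\
  }\
}
```

With this, TorchServe itself aggregates up to batchSize requests (waiting at most maxBatchDelay ms) before calling the handler.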

  5. Build and run the Docker image:

docker build . -t "your_tag:your_version"

docker run -p 8080:8080 --gpus all "your_tag:your_version"

(-p exposes the inference port used below; drop --gpus all if running on CPU)

Getting predictions

Once the Docker image is running, you can send POST requests to: localhost:8080/predictions/my_model (where my_model is the name of your model).

The handler in this project expects the input images to be sent via a multipart form, where each key is the string "img"+[index] and each value is the bytes of the corresponding image.

Example:

For a batch_size of 5, we would have the following in our Multipart form request:

"img1": [bytes_of_the_1st_image],
"img2": [bytes_of_the_2st_image],
"img3": [bytes_of_the_3st_image],
"img4": [bytes_of_the_4st_image],
"img5": [bytes_of_the_5st_image],

The JSON returned by the request contains a single list. The i-th element of this list holds the detection results for the i-th image, each detection represented as (x1, y1, x2, y2, conf, cls).

There is a request example in the image of this README. Note that if there are fewer input images than the batch size, the rest of the inference batch will be padded with zero inputs.
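
The zero-padding behaviour can be pictured with a small sketch (NumPy stands in here for the torch tensors the real handler uses, and the function name is ours):

```python
import numpy as np

def pad_batch(images, batch_size):
    # images: list of equally-shaped arrays, len(images) <= batch_size.
    # Missing slots are filled with all-zero inputs so the model always
    # sees a full, fixed-size batch.
    padding = [np.zeros_like(images[0])] * (batch_size - len(images))
    return np.stack(images + padding)
```

Detections for the padded slots are simply ignored; only the first len(images) entries of the result are meaningful.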

Note:

The yolov5 folder in ressources is only there to export the model to a TorchScript version. (It could be trimmed to keep only the export.py file.)

For docker-compose, you might run into GPU issues:

  • check that nvidia-docker is installed
  • adjust the docker-compose config to force GPU usage (there is an open issue about this on the docker-compose GitHub)
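
One way to force GPU access in Compose (v1.28+) is the device-reservations syntax; a sketch with the service name and image tag assumed:

```yaml
services:
  torchserve:
    image: your_tag:your_version
    ports:
      - "8080:8080"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```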

If you want to run on a CPU, change 'cuda:0' to 'cpu' in the export.py file of yolov5.

TO DO:

  • For now I have only tested this with a GPU, as that is my use case; later I'll try to automate the build so that switching to CPU is easier
  • The whole yolov5 repo is in the ressources folder, but only the export step is used; I will refactor to keep only the export part (a bit tricky with dependencies)
Comments
  • --model-store directory not found: model_store

    Hello, thanks for this project.

    I used your default Dockerfile; it builds fine, but when I run the container I get this error: --model-store directory not found: model_store

    Do you have any idea?

    opened by metempasa 8
  • Multipart form format

    Could you provide an example of how you convert your images to bytes and how you're formatting your POST requests (specifically how to include the bytes)? As someone who's never seen any of this before, it would be really helpful! Thank you!

    opened by kellansteele 3
  • load model failed

    torch-model-archiver --model-name fallDown --version 0.1 --serialized-file /data/share/imageAlgorithm/zhangcheng/code/yolov5/runs/train/exp5/weights/best.torchscript.pt --handler /data/share/imageAlgorithm/zhangcheng/code/yolov5/utils/torchserve_handler.py --extra-files /data/share/imageAlgorithm/zhangcheng/code/yolov5/runs/train/exp5/weights/index_to_name.json,/data/share/imageAlgorithm/zhangcheng/code/yolov5/utils/torchserve_handler.py --export-path /data/share/imageAlgorithm/zhangcheng/code/yolov5/runs/train/exp5/weights/
    
    docker run -itd --gpus '"device=5,6"' -p 8080:8080 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 --name fallDetect -v /data/share/imageAlgorithm/zhangcheng/code/yolov5/runs/train/exp5/weights/:/home/model-server/model-store/ pytorch/torchserve /bin/bash
    
    torchserve --start --ncs --model-store /home/model-server/model-store --models ./model-store/fallDetect.mar 
    
    model-server@aec403759fcc:~$ torchserve --start --ncs --model-store /home/model-server/model-store --models model-store/fallDown.mar
    model-server@aec403759fcc:~$ 2021-03-16 02:37:49,517 [INFO ] main org.pytorch.serve.ModelServer -
    Torchserve version: 0.3.0
    TS Home: /home/venv/lib/python3.6/site-packages
    Current directory: /home/model-server
    Temp directory: /home/model-server/tmp
    Number of GPUs: 2
    Number of CPUs: 24
    Max heap size: 30688 M
    Python executable: /home/venv/bin/python3
    Config file: config.properties
    Inference address: http://0.0.0.0:8080
    Management address: http://0.0.0.0:8081
    Metrics address: http://0.0.0.0:8082
    Model Store: /home/model-server/model-store
    Initial Models: model-store/fallDown.mar
    Log dir: /home/model-server/logs
    Metrics dir: /home/model-server/logs
    Netty threads: 32
    Netty client threads: 0
    Default workers per model: 2
    Blacklist Regex: N/A
    Maximum Response Size: 6553500
    Maximum Request Size: 6553500
    Prefer direct buffer: false
    Allowed Urls: [file://.*|http(s)?://.*]
    Custom python dependency for model allowed: false
    Metrics report format: prometheus
    Enable metrics API: true
    2021-03-16 02:37:49,545 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: model-store/fallDown.mar
    2021-03-16 02:37:50,042 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 040e4e47c1da44c290d18ac9fe5c0b62
    2021-03-16 02:37:50,058 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 0.1 for model fallDown
    2021-03-16 02:37:50,058 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 0.1 for model fallDown
    2021-03-16 02:37:50,058 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model fallDown loaded.
    2021-03-16 02:37:50,058 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: fallDown, count: 2
    2021-03-16 02:37:50,075 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
    2021-03-16 02:37:50,166 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
    2021-03-16 02:37:50,166 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9001
    2021-03-16 02:37:50,167 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
    2021-03-16 02:37:50,168 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]9127
    2021-03-16 02:37:50,168 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
    2021-03-16 02:37:50,168 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
    2021-03-16 02:37:50,169 [DEBUG] W-9001-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - W-9001-fallDown_0.1 State change null -> WORKER_STARTED
    2021-03-16 02:37:50,170 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
    2021-03-16 02:37:50,170 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
    2021-03-16 02:37:50,171 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
    2021-03-16 02:37:50,176 [INFO ] W-9001-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001
    2021-03-16 02:37:50,191 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9001.
    2021-03-16 02:37:50,195 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9000
    2021-03-16 02:37:50,196 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]9128
    2021-03-16 02:37:50,197 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
    2021-03-16 02:37:50,198 [DEBUG] W-9000-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - W-9000-fallDown_0.1 State change null -> WORKER_STARTED
    2021-03-16 02:37:50,198 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
    2021-03-16 02:37:50,199 [INFO ] W-9000-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
    2021-03-16 02:37:50,203 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
    Model server started.
    2021-03-16 02:37:50,432 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:100.0|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,440 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:428.0089874267578|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,441 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:71.74675369262695|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,441 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:14.4|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,442 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:117864.27734375|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,442 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:9002.609375|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    2021-03-16 02:37:50,443 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:8.2|#Level:Host|#hostname:aec403759fcc,timestamp:1615862270
    Both workers (W-9000 and W-9001) then die with the same traceback, and each automatic retry fails the same way (the identical interleaved and repeated output is condensed to a single occurrence here):

    2021-03-16 02:37:50,996 [INFO ] W-9001-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
    Traceback (most recent call last):
      File "/home/venv/lib/python3.6/site-packages/ts/model_service_worker.py", line 182, in <module>
        worker.run_server()
      File "/home/venv/lib/python3.6/site-packages/ts/model_service_worker.py", line 154, in run_server
        self.handle_connection(cl_socket)
      File "/home/venv/lib/python3.6/site-packages/ts/model_service_worker.py", line 116, in handle_connection
        service, result, code = self.load_model(msg)
      File "/home/venv/lib/python3.6/site-packages/ts/model_service_worker.py", line 89, in load_model
        service = model_loader.load(model_name, model_dir, handler, gpu, batch_size, envelope)
      File "/home/venv/lib/python3.6/site-packages/ts/model_loader.py", line 104, in load
        initialize_fn(service.context)
      File "/home/venv/lib/python3.6/site-packages/ts/torch_handler/base_handler.py", line 79, in initialize
        self.mapping = load_label_mapping(mapping_file_path)
      File "/home/venv/lib/python3.6/site-packages/ts/utils/util.py", line 40, in load_label_mapping
        mapping = json.load(f)
      File "/usr/lib/python3.6/json/__init__.py", line 299, in load
        parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
      File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
        obj, end = self.scan_once(s, idx)
    json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 1 (char 16)

    2021-03-16 02:37:51,009 [WARN ] W-9001-fallDown_0.1 org.pytorch.serve.wlm.BatchAggregator - Load model failed: fallDown, error: Worker died.
    2021-03-16 02:37:51,010 [DEBUG] W-9001-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - W-9001-fallDown_0.1 State change WORKER_STARTED -> WORKER_STOPPED
    2021-03-16 02:37:51,013 [INFO ] W-9001-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9001 in 1 seconds.
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     initialize_fn(service.context)
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/venv/lib/python3.6/site-packages/ts/torch_handler/base_handler.py", line 79, in initialize
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self.mapping = load_label_mapping(mapping_file_path)
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/home/venv/lib/python3.6/site-packages/ts/utils/util.py", line 40, in load_label_mapping
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     mapping = json.load(f)
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    2021-03-16 02:37:52,901 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return _default_decoder.decode(s)
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     obj, end = self.scan_once(s, idx)
    2021-03-16 02:37:52,902 [INFO ] W-9000-fallDown_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 1 (char 16)
    2021-03-16 02:37:52,905 [INFO ] epollEventLoopGroup-5-4 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
    2021-03-16 02:37:52,905 [DEBUG] W-9000-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
    2021-03-16 02:37:52,905 [DEBUG] W-9000-fallDown_0.1 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
    java.lang.InterruptedException
            at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
            at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
            at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
            at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:188)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
            at java.base/java.lang.Thread.run(Thread.java:834)
    
    
    opened by nobody-cheng 2
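The `JSONDecodeError: Expecting property name enclosed in double quotes` in the worker log above almost always means `index_to_name.json` is malformed — typically a trailing comma or single-quoted keys. A quick, hedged sketch for checking the file's contents before baking it into the image (the helper name is illustrative, not part of the repo):

```python
import json

def load_label_mapping(text: str):
    """Return (mapping, None) on success, or (None, error) if the JSON is
    malformed -- mirroring the failure the TorchServe worker hits at startup."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as e:
        return None, str(e)

good = '{"0": "person", "1": "bicycle"}'
bad = '{\n  "0": "person",\n  "1": "bicycle",\n}'  # trailing comma -> invalid JSON

mapping, _ = load_label_mapping(good)
_, bad_err = load_label_mapping(bad)
```

Here `bad_err` reports the same "Expecting property name enclosed in double quotes" message seen in the log, pointing at the line with the trailing comma.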
  • we must resize the resolution of the input img to 640

    we must resize the resolution of the input img to 640

    The expanded size of the tensor (640) must match the existing size (576) at non-singleton dimension 2. Target sizes: [3, 640, 640]. Tensor sizes: [3, 615, 576]

    opened by xun-dao 1
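The tensor-size error above (`[3, 615, 576]` vs. `[3, 640, 640]`) arises because batched inference stacks all images into one fixed-size tensor. One common fix is YOLOv5-style letterboxing — scale so the longer side is 640, then pad the rest. A minimal dependency-free numpy sketch (nearest-neighbour resampling stands in for a proper `cv2.resize`):

```python
import numpy as np

def letterbox(img: np.ndarray, size: int = 640, pad_value: int = 114) -> np.ndarray:
    """Scale an HxWxC image so its longer side equals `size`, then pad the
    shorter side with `pad_value` to produce a square size x size array."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Nearest-neighbour resize via index sampling (avoids a cv2 dependency).
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = img[rows][:, cols]
    out = np.full((size, size, img.shape[2]), pad_value, dtype=img.dtype)
    top, left = (size - new_h) // 2, (size - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized
    return out

square = letterbox(np.zeros((615, 576, 3), dtype=np.uint8))  # -> (640, 640, 3)
```

Applying this in the handler's preprocessing (or client-side before sending) makes the 615×576 input from the issue stackable into the `[3, 640, 640]` batch tensor.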
  • Finding the labels json file

    Finding the labels json file

    It looks like there needs to be an updated index_to_name.json containing the desired labels. If we're using the off-the-shelf yolov5, where do we find this file?

    opened by jevy 1
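For the off-the-shelf model, `index_to_name.json` just maps class indices (as strings) to the class names in training order — for stock COCO-trained YOLOv5 weights that order starts `person`, `bicycle`, `car`, … A hedged sketch for generating the file from a `names` list (take the list from your own `data.yaml` or checkpoint; the three-entry list below is a truncated example):

```python
import json

# Class names in training order; replace with the full `names` list
# from your data.yaml / checkpoint (80 entries for stock COCO weights).
names = ["person", "bicycle", "car"]

index_to_name = {str(i): name for i, name in enumerate(names)}
text = json.dumps(index_to_name, indent=2)

# write it next to the handler:
# with open("index_to_name.json", "w") as f:
#     f.write(text)
```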
  • `box_iou` function is not defined in `non_max_suppression` function

    `box_iou` function is not defined in `non_max_suppression` function

    https://github.com/louisoutin/yolov5_torchserve/blob/1fcbd7bee713983d4975a9a72d3e9ee8cd69c84e/ressources/torchserve_handler.py#L155

    On this line, the `box_iou` function is used but never defined. It seems this branch is never reached at runtime; nevertheless, the referenced function should have a definition.

    opened by psycoder21 0
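A drop-in definition already exists as `torchvision.ops.box_iou`; for reference, here is an equivalent self-contained numpy sketch of the pairwise IoU it computes over `(x1, y1, x2, y2)` boxes (assumption: the handler's NMS uses the same box format, as YOLOv5's utilities do):

```python
import numpy as np

def box_iou(box1: np.ndarray, box2: np.ndarray) -> np.ndarray:
    """Pairwise IoU of two box sets in (x1, y1, x2, y2) format.
    Returns an (N, M) matrix, like torchvision.ops.box_iou."""
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])
    # Intersection rectangle corners, broadcast to (N, M, 2).
    lt = np.maximum(box1[:, None, :2], box2[None, :, :2])
    rb = np.minimum(box1[:, None, 2:], box2[None, :, 2:])
    wh = np.clip(rb - lt, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area1[:, None] + area2[None, :] - inter)

a = np.array([[0, 0, 10, 10]], dtype=float)
b = np.array([[0, 0, 10, 10], [5, 5, 15, 15]], dtype=float)
iou = box_iou(a, b)  # identical boxes -> 1.0; half-overlapping -> 25/175
```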
  • dockerfile

    dockerfile

    The Dockerfile has `EXPOSE 8080 8081`. Hello, I have a question: can I use ports other than 8080/8081? I cannot reach the server on 8080/8081 when I use http://localhost:8080/predictions/yolov5

    opened by tkone2018 0
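The ports are not fixed by the handler; TorchServe reads them from `config.properties`. A sketch, assuming you COPY a custom config into the image (8080/8081 are TorchServe's defaults for inference and management):

```properties
# config.properties — custom ports
inference_address=http://0.0.0.0:9090
management_address=http://0.0.0.0:9091
```

Then update the Dockerfile's `EXPOSE` line to match and publish the ports when running, e.g. `docker run -p 9090:9090 -p 9091:9091 "your_tag:your_version"`. If requests on 8080 time out with the defaults, the usual cause is a missing `-p 8080:8080` mapping on `docker run`.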
  • Can we deploy this to AWS SageMaker

    Can we deploy this to AWS SageMaker

    I've trained YOLOv5 and I have my custom weights. Now I have to deploy it on Amazon SageMaker, so I need to create an endpoint, which is something I don't know how to do. Can you help me?

    opened by ghost 0
  • Trouble deploying Yolov5 model in Torch Serve environment

    Trouble deploying Yolov5 model in Torch Serve environment

    I am trying to deploy the YOLOv5 model in the TorchServe environment, but on sending the request below I get no response; the request seems to get stuck:

    $ curl http://127.0.0.1:8080/predictions/yolov5x -T /home/atinesh/Desktop/COCO_val2014_000000562557.jpg

    I have followed all the steps mentioned in the README.md and was able to successfully deploy the container.

    Note: I had to make some changes in Dockerfile and docker run command to make it work

    Deployed model is running properly, on checking the model health I am getting below response

    $ curl "http://localhost:8081/models/yolov5x"
    [
      {
        "modelName": "yolov5x",
        "modelVersion": "1.0",
        "modelUrl": "yolov5x.mar",
        "runtime": "python",
        "minWorkers": 6,
        "maxWorkers": 6,
        "batchSize": 1,
        "maxBatchDelay": 100,
        "loadedAtStartup": true,
        "workers": [
          {
            "id": "9000",
            "startTime": "2021-11-11T09:25:16.121Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 907,
            "gpu": false,
            "gpuUsage": "N/A"
          },
          {
            "id": "9001",
            "startTime": "2021-11-11T09:25:16.123Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 902,
            "gpu": false,
            "gpuUsage": "N/A"
          },
          {
            "id": "9002",
            "startTime": "2021-11-11T09:25:16.123Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 897,
            "gpu": false,
            "gpuUsage": "N/A"
          },
          {
            "id": "9003",
            "startTime": "2021-11-11T09:25:16.124Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 882,
            "gpu": false,
            "gpuUsage": "N/A"
          },
          {
            "id": "9004",
            "startTime": "2021-11-11T09:25:16.124Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 892,
            "gpu": false,
            "gpuUsage": "N/A"
          },
          {
            "id": "9005",
            "startTime": "2021-11-11T09:25:16.127Z",
            "status": "READY",
            "memoryUsage": 0,
            "pid": 887,
            "gpu": false,
            "gpuUsage": "N/A"
          }
        ]
      }
    ]
    

    Yolov5 (XLarge) model is trained on custom COCO dataset to detect 2 objects person & bicycle, below is the link of the trained model file

    yolov5x.pt

    Image used for Inference: COCO_val2014_000000562557.jpg

    I have tried two different handler files handler #1 provided by you @louisoutin and handler #2 provided by @joek13 but the same issue persists

    It seems like the latest YOLOv5 code is causing the problem.

    opened by atinesh-s 0
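One likely cause of the hang: `curl -T` uploads the raw file body, while the handler in this repo expects a multipart form with keys `img1`…`imgN` (per the README). A hedged sketch of building that payload client-side (`build_multipart_payload` is an illustrative helper, not part of the repo; the actual POST needs the `requests` package):

```python
def build_multipart_payload(images: list[bytes]) -> dict[str, bytes]:
    """Map raw image bytes to the "img1"..."imgN" keys the handler parses."""
    return {f"img{i + 1}": img for i, img in enumerate(images)}

files = build_multipart_payload([open_bytes] if (open_bytes := b"\xff\xd8") else [])

# With the `requests` package installed:
# resp = requests.post("http://localhost:8080/predictions/yolov5x", files=files)
# print(resp.json())
```

The curl equivalent would be a form upload rather than `-T`, e.g. `curl -F "img1=@COCO_val2014_000000562557.jpg" http://127.0.0.1:8080/predictions/yolov5x`.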
  • Should I resize an image to a square?

    Should I resize an image to a square?

    When I use TorchServe, I must resize my image to a square such as 640 × 640, but if I use my model directly, I can run inference on images such as 640 × 516 (or 640 × any size).

    opened by xun-dao 0
Owner
Machine Learning Engineer working with timeseries data coming from wind farms and industrial facilities.