华为云AI开发平台ModelArts在lite资源池上使用Snt9B完成推理任务_云淘科技
场景描述
本案例介绍如何在Snt9B上使用deployment部署在线在推理服务。
操作步骤
拉取镜像。本测试镜像为bert_pretrain_mindspore:v1,已经把测试数据和代码打进镜像中。
docker pull swr.cn-southwest-2.myhuaweicloud.com/os-public-repo/bert_pretrain_mindspore:v1 docker tag swr.cn-southwest-2.myhuaweicloud.com/os-public-repo/bert_pretrain_mindspore:v1 bert_pretrain_mindspore:v1
在主机上新建config.yaml文件。
config.yaml文件用于配置pod,本示例中使用sleep命令启动pod,便于进入pod调试。您也可以修改command为对应的任务启动命令(如“python inference.py”),任务会在启动容器后执行。
config.yaml内容如下:
apiVersion: apps/v1 kind: Deployment metadata: name: yourapp labels: app: infers spec: replicas: 1 selector: matchLabels: app: infers template: metadata: labels: app: infers spec: schedulerName: volcano nodeSelector: accelerator/huawei-npu: ascend-1980 containers: - image: bert_pretrain_mindspore:v1 # Inference image name imagePullPolicy: IfNotPresent name: mindspore command: - "sleep" - "1000000000000000000" resources: requests: huawei.com/ascend-1980: "1" # 需求卡数,key保持不变。Number of required NPUs. The maximum value is 16. You can add lines below to configure resources such as memory and CPU. limits: huawei.com/ascend-1980: "1" # 限制卡数,key保持不变。The value must be consistent with that in requests. volumeMounts: - name: ascend-driver #驱动挂载,保持不动 mountPath: /usr/local/Ascend/driver - name: ascend-add-ons #驱动挂载,保持不动 mountPath: /usr/local/Ascend/add-ons - name: hccn #驱动hccn配置,保持不动 mountPath: /etc/hccn.conf - name: npu-smi #npu-smi mountPath: /usr/local/bin/npu-smi - name: localtime #The container time must be the same as the host time. mountPath: /etc/localtime volumes: - name: ascend-driver hostPath: path: /usr/local/Ascend/driver - name: ascend-add-ons hostPath: path: /usr/local/Ascend/add-ons - name: hccn hostPath: path: /etc/hccn.conf - name: npu-smi hostPath: path: /usr/local/bin/npu-smi - name: localtime hostPath: path: /etc/localtime
根据config.yaml创建pod。
kubectl apply -f config.yaml
检查pod启动情况,执行下述命令。如果显示“1/1 running”状态代表启动成功。
kubectl get pod -A
进入容器,{pod_name}替换为您的pod名字(get pod中显示的名字),{namespace}替换为您的命名空间(默认为default)。
kubectl exec -it {pod_name} bash -n {namespace}
激活conda模式。
su - ma-user //切换用户身份 conda activate MindSpore //激活 MindSpore环境
创建测试代码test.py。
from flask import Flask, request import json app = Flask(__name__) @app.route('/greet', methods=['POST']) def say_hello_func(): print("----------- in hello func ----------") data = json.loads(request.get_data(as_text=True)) print(data) username = data['name'] rsp_msg = 'Hello, {}!'.format(username) return json.dumps({"response":rsp_msg}, indent=4) @app.route('/goodbye', methods=['GET']) def say_goodbye_func(): print("----------- in goodbye func ----------") return ' Goodbye! ' @app.route('/', methods=['POST']) def default_func(): print("----------- in default func ----------") data = json.loads(request.get_data(as_text=True)) return ' called default func ! {} '.format(str(data)) # host must be "0.0.0.0", port must be 8080 if __name__ == '__main__': app.run(host="0.0.0.0", port=8080)
执行代码,执行后如下图所示,会部署一个在线服务,该容器即为服务端。
python test.py
图1 部署在线服务
在XShell中新开一个终端,参考步骤5~7进入容器,该容器为客户端。执行以下命令验证自定义镜像的三个API接口功能。当显示如图所示时,即可调用服务成功。
curl -X POST -H "Content-Type: application/json" --data '{"name":"Tom"}' 127.0.0.1:8080/ curl -X POST -H "Content-Type: application/json" --data '{"name":"Tom"}' 127.0.0.1:8080/greet curl -X GET 127.0.0.1:8080/goodbye
图2 访问在线服务
limit/request配置cpu和内存大小,已知单节点Snt9B机器为:8张Snt9B卡+192u1536g,请合理规划,避免cpu和内存限制过小引起任务无法正常运行。
父主题: k8s Cluster资源使用
同意关联代理商云淘科技,购买华为云产品更优惠(QQ 78315851)
内容没看懂? 不太想学习?想快速解决? 有偿解决: 联系专家