Deploy a Hugging Face Pruned Model on CPU
This tutorial shows how to take a pruned model (here, PruneBert from Hugging Face) and use TVM's support for sparse models to accelerate inference on CPU.
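To give a feel for why pruning can translate into a speedup, here is a minimal pure-Python sketch of the underlying idea: measure how many weights are zero, store only the nonzero ones in compressed sparse row (CSR) form, and skip the zeros during a matrix-vector product. It does not use TVM or PruneBert; the helper names (`sparsity`, `to_csr`, `csr_matvec`) and the tiny `pruned_weight` matrix are hypothetical and exist only for illustration.

```python
def sparsity(weight):
    """Return the fraction of entries in a 2-D list that are exactly zero."""
    total = sum(len(row) for row in weight)
    if total == 0:
        raise ValueError("weight matrix is empty")
    zeros = sum(1 for row in weight for value in row if value == 0)
    return zeros / total


def to_csr(weight):
    """Convert a dense 2-D list to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in weight:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # keep only nonzero values
                indices.append(col)   # remember which column each came from
        indptr.append(len(data))      # mark where this row's entries end
    return data, indices, indptr


def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR matrix by a dense vector, touching only stored entries."""
    out = []
    for r in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[r], indptr[r + 1]):
            acc += data[k] * x[indices[k]]
        out.append(acc)
    return out


# Hypothetical pruned weight matrix: 9 of its 12 entries are zero.
pruned_weight = [
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, -2.0],
]

data, indices, indptr = to_csr(pruned_weight)
result = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0, 4.0])
print(sparsity(pruned_weight), data, indices, indptr, result)
```

In this toy case the sparse product performs 3 multiplications instead of 12. A real deployment relies on optimized sparse kernels rather than Python loops, and whether sparsity pays off depends on how sparse the weights are and how the nonzeros are laid out.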