[Nanny-level tutorial | YOLOv8 improvements] [5] Improve both accuracy and speed by replacing the backbone network with FasterNet
"Blogger Profile"
Hello friends, my name is A Xu. I focus on sharing research related to artificial intelligence, AIGC, Python, and computer vision.
✌For more learning resources, you can follow my WeChat official account [Axu Algorithm and Machine Learning] to learn and communicate together~
👍Thank you, friends, for liking and following!
"------Classic Recommendations from Past Issues------"
1. AI application software development practical column [link]
2. Machine Learning Practical Column [Link], 31 issues published so far and still updating. Welcome to follow~
3. Deep Learning [Pytorch] Column [Link]
4.[Stable Diffusion Painting Series] Column [Link]
"------text------"
Preface
Paper publication time: 2023.03.07
github address: https://github.com/JierunChen/FasterNet
paper address: https://export.arxiv.org/pdf/2303.03667v1.pdf
The paper proposes a novel partial convolution (PConv) that extracts spatial features more efficiently by cutting down redundant computation and memory access, improving both accuracy and speed on the datasets the authors tested. This article explains in detail how to replace the YOLOv8 backbone network with FasterNet, and how to use the modified YOLOv8 for object detection training and inference. All of the source code is provided for free for readers to study; if you need it, you can download it at the end of the article.
The version of ultralytics used in this improvement is: ultralytics == 8.0.227
Table of contents
- Preface
- 1. Introduction to FasterNet
- 1.1 Network structure
- 1.2 Performance comparison
- 2. YOLOv8 main steps replacement
- Comparison before and after YOLOv8 network structure
- Define FasterNet related classes
- Modify specified file
- 3. Load the configuration file and train
- 4. Model reasoning
- [Source code available for free]
- Conclusion
1. Introduction to FasterNet
Abstract: In order to design fast neural networks, many research efforts have focused on reducing the number of floating point operations (FLOPs). However, we observe that this reduction in FLOPs does not necessarily result in a similar level of latency reduction. This is primarily due to inefficiently low floating point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that this low FLOPS is mainly due to the frequent memory accesses of operators, especially depthwise convolution. We therefore propose a novel partial convolution (PConv) that extracts spatial features more efficiently by cutting down redundant computation and memory access. Building on our PConv, we further propose FasterNet, a new family of neural networks that achieves higher running speed than other networks on a wide range of devices without compromising accuracy on a variety of vision tasks. For example, on ImageNet-1k, our tiny FasterNet-T0 is 3.1x, 3.1x, and 2.5x faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively, while being 2.9% more accurate. Our large FasterNet-L achieves an impressive 83.5% top-1 accuracy, on par with the emerging Swin-B, while achieving 49% higher inference throughput on the GPU and saving 42% compute time on the CPU.
The main highlights of the paper are as follows:
• Emphasized the importance of increasing floating point operations per second (FLOPS), not just reducing FLOPs, in order to achieve faster neural networks.
• Introduced a simple yet fast and efficient operator called PConv, which has high potential to replace the existing preferred option, namely depthwise convolution (DWConv).
• Introduced FasterNet, which runs smoothly and universally fast on a variety of devices including GPUs, CPUs, and ARM processors.
• Conducted extensive experiments on various tasks and verified the high speed and effectiveness of our PConv and FasterNet.
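To make the PConv idea concrete, here is a minimal PyTorch sketch (an illustration, not the authors' implementation): a regular 3x3 convolution is applied to only the first dim // n_div channels, and the remaining channels pass through untouched, so with n_div = 4 the convolution costs only 1/16 of the FLOPs of a full 3x3 convolution on the same tensor.

```python
import torch
import torch.nn as nn


class PConv(nn.Module):
    """Sketch of partial convolution (PConv): convolve only the first
    dim // n_div channels; the remaining channels are passed through
    unchanged, cutting both FLOPs and memory access."""

    def __init__(self, dim: int, n_div: int = 4):
        super().__init__()
        self.dim_conv = dim // n_div                 # channels that get convolved
        self.dim_untouched = dim - self.dim_conv     # channels left untouched
        # with n_div=4, this 3x3 conv has (dim/4)^2 * 9 weights,
        # i.e. 1/16 of a full dim x dim 3x3 convolution
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, 1, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # split -> convolve the first slice -> concatenate back
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_untouched], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)
```

Note that the output keeps the same shape as the input, which is why PConv can drop in as a spatial-mixing operator inside a block.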
1.1 Network structure
1.2 Performance comparison
2. YOLOv8 main steps replacement
Comparison before and after YOLOv8 network structure
Define FasterNet related classes
Add the following FasterNet source code block to ultralytics/nn/modules/block.py:

Then add the following import code at the top of the same ultralytics/nn/modules/block.py file:
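The two code blocks referenced above appear as images in the original post. As a stand-in reference, here is a minimal, simplified PyTorch sketch of the three classes the yolov8-FasterNet.yaml configuration relies on (PatchEmbed_FasterNet, PatchMerging_FasterNet, BasicStage). The signatures and internals below are assumptions based on the FasterNet paper, not the author's exact code; use the post's download for the real thing.

```python
import torch
import torch.nn as nn


class FasterNetBlock(nn.Module):
    """One FasterNet block (sketch): a partial 3x3 conv that only touches
    dim // n_div channels, followed by a two-layer pointwise MLP, with a
    residual connection."""

    def __init__(self, dim, n_div=4, mlp_ratio=2.0):
        super().__init__()
        self.dim_conv = dim // n_div
        self.pconv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, 1, 1, bias=False)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, dim, 1, bias=False),
        )

    def forward(self, x):
        shortcut = x
        # partial convolution: convolve the first slice, pass the rest through
        x1, x2 = torch.split(x, [self.dim_conv, x.size(1) - self.dim_conv], dim=1)
        x = torch.cat((self.pconv(x1), x2), dim=1)
        return shortcut + self.mlp(x)


class BasicStage(nn.Module):
    """A stage: a sequence of FasterNet blocks at fixed resolution."""

    def __init__(self, dim, n_div=4, depth=1):
        super().__init__()
        self.blocks = nn.Sequential(*(FasterNetBlock(dim, n_div) for _ in range(depth)))

    def forward(self, x):
        return self.blocks(x)


class PatchEmbed_FasterNet(nn.Module):
    """Stem: embed the image with a non-overlapping patch_size x patch_size conv."""

    def __init__(self, in_ch, embed_dim, patch_size=4, stride=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, embed_dim, patch_size, stride, bias=False)
        self.norm = nn.BatchNorm2d(embed_dim)

    def forward(self, x):
        return self.norm(self.proj(x))


class PatchMerging_FasterNet(nn.Module):
    """Downsampling between stages: a 2x2 conv with stride 2."""

    def __init__(self, in_ch, out_ch, k=2, stride=2):
        super().__init__()
        self.reduction = nn.Conv2d(in_ch, out_ch, k, stride, bias=False)
        self.norm = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return self.norm(self.reduction(x))
```

With this layout, the YAML arguments map as: PatchEmbed_FasterNet [40, 4, 4] = (embed_dim, patch_size, stride), PatchMerging_FasterNet [80, 2, 2] = (out_ch, k, stride), and BasicStage [40, 1] = (dim, n_div-related argument), with the input channels prepended by parse_model.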
Modify specified file
Add the following code to the ultralytics/nn/modules/__init__.py file:
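The exact lines appear as an image in the original post; in outline, the registration looks like the excerpt below (the surrounding import and __all__ contents in your ultralytics version will differ):

```python
# ultralytics/nn/modules/__init__.py (excerpt)
from .block import BasicStage, PatchEmbed_FasterNet, PatchMerging_FasterNet

# ...then append the new class names to the existing __all__ tuple, e.g.:
# __all__ = (..., "BasicStage", "PatchEmbed_FasterNet", "PatchMerging_FasterNet")
```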
In ultralytics/nn/tasks.py, import the corresponding class names at the top of the file, and add the following code to the parse_model parsing function:
```python
elif m in [BasicStage]:
    args.pop(1)
```
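Tutorials following this pattern usually also wire the input/output channels for the two downsampling modules inside parse_model. The fragment below is only a sketch, assuming class signatures of the form (in_ch, out_ch, ...); adapt it to the exact code you added to block.py:

```python
# Sketch: channel wiring inside parse_model (hypothetical, adapt to your classes)
elif m in (PatchEmbed_FasterNet, PatchMerging_FasterNet):
    c1, c2 = ch[f], args[0]   # input channels from previous layer, output from YAML
    args = [c1, c2, *args[1:]]
```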
In ultralytics/nn/tasks.py, search for self.model.modules(), locate the corresponding code, and add the code content shown in the box below it:
Create a new file named yolov8-FasterNet.yaml in the ultralytics/cfg/models/v8 folder, with the following content:
```yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, PatchEmbed_FasterNet, [40, 4, 4]] # 0-P1/4
  - [-1, 1, BasicStage, [40, 1]] # 1
  - [-1, 1, PatchMerging_FasterNet, [80, 2, 2]] # 2-P2/8
  - [-1, 2, BasicStage, [80, 1]] # 3-P3/16
  - [-1, 1, PatchMerging_FasterNet, [160, 2, 2]] # 4
  - [-1, 8, BasicStage, [160, 1]] # 5-P4/32
  - [-1, 1, PatchMerging_FasterNet, [320, 2, 2]] # 6
  - [-1, 2, BasicStage, [320, 1]] # 7
  - [-1, 1, SPPF, [320, 5]] # 8

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 5], 1, Concat, [1]] # cat backbone P4
  - [-1, 1, C2f, [512]] # 11

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 3], 1, Concat, [1]] # cat backbone P3
  - [-1, 1, C2f, [256]] # 14 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P4
  - [-1, 1, C2f, [512]] # 17 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 8], 1, Concat, [1]] # cat head P5
  - [-1, 1, C2f, [1024]] # 20 (P5/32-large)

  - [[14, 17, 20], 1, Detect, [nc]] # Detect(P3, P4, P5)
```
3. Load the configuration file and train
Load the yolov8-FasterNet.yaml configuration file and run the train.py training code:
```python
#coding:utf-8
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/v8/yolov8-FasterNet.yaml')
    model.load('yolov8n.pt')  # loading pretrain weights
    model.train(data='datasets/TomatoData/data.yaml', epochs=30, batch=4)
```
Pay attention to observe whether the printed network structure is modified normally, as shown in the figure below:
4. Model reasoning
After the model training is completed, we use the trained model to detect the image:
```python
#coding:utf-8
from ultralytics import YOLO
import cv2

# Path of the model to be loaded
# path = 'models/best2.pt'
path = 'runs/detect/train/weights/best.pt'

# Path of the image to be tested
img_path = "TestFiles/Riped tomato_8.jpeg"

# Load the trained model
# conf 0.25  object confidence threshold for detection
# iou  0.7   intersection over union (IoU) threshold for NMS
model = YOLO(path, task='detect')

# Detect the image
results = model(img_path)
res = results[0].plot()
# res = cv2.resize(res, dsize=None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)
cv2.imshow("YOLOv8 Detection", res)
cv2.waitKey(0)
```
[Source code available for free]
To help friends learn and practice better, this article's complete code, sample dataset, paper, and other related content have been packaged and uploaded for reference. How to get it: follow the official account card below: [Axu Algorithm and Machine Learning], and send [yolov8 improvement] to receive it for free.
Conclusion
If you have any suggestions or comments about this article, please leave a message in the comment area!
If you found it helpful, thank you for liking, following, and bookmarking!